Abstract. Limiting distributions are derived for the sparse connected components
that are present when a random graph on $n$ vertices has approximately
$\frac12 n$ edges. In particular, we show that such a graph consists entirely of trees,
unicyclic components, and bicyclic components with probability approaching
$\sqrt{2/3}\,\cosh\sqrt{5/18} \approx 0.9325$ as $n \to \infty$. The limiting probability that it consists
of trees, unicyclic components, and at most one other component is approximately
0.9957; the limiting probability that it is planar lies between 0.987 and
0.9998. When a random graph evolves and the number of edges passes $\frac12 n$, its
components grow in cyclic complexity according to an interesting Markov process
whose asymptotic structure is derived. The probability that there never is more
than a single component with more edges than vertices, throughout the evolution,
approaches $5\pi/18 \approx 0.8727$. A "uniform" model of random graphs, which allows
self-loops and multiple edges, is shown to lead to formulas that are substantially
simpler than the analogous formulas for the classical random graphs of Erdős
and Rényi. The notions of "excess" and "deficiency," which are significant characteristics
of the generating function as well as of the graphs themselves, lead to
a mathematically attractive structural theory for the uniform model. A general 
approach to the study of stopping configurations makes it possible to sharpen pre- 
viously obtained estimates in a uniform manner and often to obtain closed forms 
for the constants of interest. Empirical results are presented to complement the 
analysis, indicating the typical behavior when n is near 20000. 



0. Introduction. When edges are added at random to $n$ initially disconnected points,
for large $n$, a remarkable transition occurs when the number of edges becomes approximately
$\frac12 n$. Erdős and Rényi [13] studied random graphs with $n$ vertices and $\frac12(1+\mu)n$ edges
as $n \to \infty$, and discovered that such graphs almost surely have the following properties: If
$\mu < 0$, only small trees and "unicyclic" components are present, where a unicyclic component
is a tree with one additional edge; moreover, the size of the largest tree component is
$(\mu - \ln(1+\mu))^{-1} \ln n + O(\log\log n)$. If $\mu = 0$, however, the largest component has size of
order $n^{2/3}$. And if $\mu > 0$, there is a unique "giant" component whose size is of order $n$; in
fact, the size of this component is asymptotically $\alpha n$ when $\mu = -\alpha^{-1}\ln(1-\alpha) - 1$. Thus,
for example, a random graph with approximately $n \ln 2$ edges will have a giant component
containing $\sim \frac12 n$ vertices.

The research that led to the present paper began in a rather curious way, as a result of 
a misunderstanding. In 1988, the students in a class taught by Richard M. Karp performed 
computer experiments in which graphs with a moderately large number of vertices were 
generated by adding one edge at a time. A rumor spread that these simulations had turned 
up a surprising fact: As each of the random graphs evolved, the story went, never once 
was there more than a single "complex" component; i.e., there never were two or more 
components present simultaneously that were neither trees nor unicyclic. Thus, the first 
connected component that acquired more edges than vertices was destined to be the giant 
component. As more edges were added, this component gradually swallowed up all of the 
others, and none of the others ever became complex before they were swallowed. 

Reports of those experiments suggested that a great simplification of the theory of 
evolving graphs might be possible. Could it be that such behavior occurs almost always, 
i.e., with probability approaching 1 as $n \to \infty$? If so, we could hope for the existence of
a much simpler explanation of the fact that a giant component emerges during the graph 
process, and we could devise rather simple algorithms for online graph updating that would 
take advantage of the unique-complex-component phenomenon. At that time the authors 
who began this investigation (DEK and BP) were unaware of Stepanov's posthumous 
paper [36]. We were motivated chiefly by the work of Bollobás [5], who had shown that
a component of size $\ge n^{2/3}$ is almost always unique once the number of edges exceeds
$\frac12 n + 2(\ln n)^{1/2} n^{2/3}$; moreover, Bollobás proved that such a component gets approximately
4 vertices larger when each new edge is added. His results blended nicely with the
unique-complex-component conjecture.

However, we soon found that the conjecture is false: There is nonzero probability
that a graph with $\frac12 n$ edges will contain several pretenders to the giant throne, and this
probability increases when the number of edges is slightly more than $\frac12 n$. We also learned
that Stepanov [36] had already obtained similar results. Thus we could not hope for a 
theory of random graphs that would be as simple as the conjecture promised. On the 
other hand, we learned that the graph evolution process does satisfy the conjecture with 
reasonably high probability; hence algorithms whose efficiency rests on the assumption of 
a unique complex component will not often be inefficient.

Further analysis revealed, in fact, that we must have misunderstood the initial reports
of experimental data. The actual probability that an evolving graph never has two complex
components approaches the limiting value $5\pi/18 \approx 0.8727$; therefore the rumor that got us
started could not have been true. In fact, the computer experiments by Karp's students had
simply reported the state of the graph when exactly $\frac12 n$ edges were present, and at certain
other fixed reporting times. A false impression arose because there is high probability that
a random graph with $\frac12 n$ edges has at most one complex component; indeed, the probability
is $0.9957 + O(n^{-1/3})$. More complicated configurations sometimes arise momentarily just
after $\frac12 n$ edges are reached. However, the fallacious rumor of 1988 has turned out to have
beneficial effects, because it was a significant catalyst for the discovery of some remarkably 
beautiful patterns. 

Sections 1-10 of this paper provide a basic introduction to the theory of evolving 
graphs and multigraphs, using generating functions as the principal tool. Two models 
of graph evolution are presented in section 1, the "graph process" and the "multigraph 
process." Their generating functions are introduced in section 2, and special aspects of 
those functions related to trees and cycles are discussed in section 3. Section 4 explains how 
to derive properties of a graph's more complex features by means of differential equations; 
the equations are solved for multigraphs in section 5 and for graphs in section 6. The 
resulting decomposition of multigraphs turns out to be surprisingly regular. Section 7 
explains the regularities and begins to analyze the algebraic properties of the functions 
obtained in section 5. Related results for connected graphs are discussed in section 8. 
Section 9 explains the combinatorial significance of the algebraic structure derived earlier. 
Finally, section 10 presents a quantitative lemma about the characteristics of random 
graphs near the critical point $\mu = 0$, making it possible to derive exact values for many
relevant statistics. 

Readers who cannot wait to get to the "good stuff" should skim sections 1-10 and 
move on to section 11, which begins a sequence of applications of the basic theory. The 
first step is to analyze the distribution of bicyclic components; then, in section 12, the 
same ideas are shown to yield the joint distribution of all kinds of components. The 
formulas obtained there have a simple structure suggesting that the traditional approach 
of focussing on connected components is unnecessarily complicated; we obtain a simpler 
and more symmetrical theory if we first consider the excess of edges over vertices, exclusive 
of tree components, then look at other properties like connectedness after conditioning on 
the excess. Section 13 motivates this principle, and section 14 derives the probability 
distribution of a graph's excess as it passes the critical point. These ideas help to nail 
down the probability that a graph with $\frac12 n$ edges is planar, as shown in section 15.

Section 16 begins the discussion of what may well be the most important notion in this 
paper; readers who have time for nothing else are encouraged to look at Figure 1, which
shows the initial stages of the "big bang." The evolution of a graph or multigraph passes
through discrete transitions as the excess increases, and important aspects of those changes
are illustrated in Figure 1; section 17 proves that this illustration represents a Markov
process that characterizes almost all graph evolutions. The $5\pi/18$ phenomenon alluded to
above is discussed in section 18, which establishes $5\pi/18$ as an upper bound for the probability
in question. Section 19 shows that, for small $n$, the probability of retaining at most
one complex component during the critical stage is in fact greater than $5\pi/18$, decreasing
monotonically with $n$.

The excess of a graph is of principal importance at the critical point, but a secondary 
concept called deficiency becomes important shortly thereafter. A graph with deficiency 0
is called "clean"; such graphs are obtained from 3-regular graphs by splitting edges and/or
by attaching trees to vertices of cycles. Section 20 explains how deficiency evolves jointly 
with increasing excess. Figure 2, at the end of that section, illustrates another Markov 
process that goes on in parallel with Figure 1. Section 21 shows that most graphs stay
clean until they have acquired approximately $\frac12 n + n^{3/4}$ edges. Section 22 looks more
closely at the moment a graph first becomes unclean. 

Section 23 tracks the growth of excess and deficiency as a multigraph continues to
evolve through $\frac12 n + n^{3/4}$, $\frac12 n + n^{4/5}$, ... edges. The excess and deficiency are shown
to be approximately normally distributed about certain well-defined values. Specifically,
when the number of edges is $\frac12 n(1 + \mu)$, with $\mu = o(1)$, the excess will be approximately
$\frac23 \mu^3 n$ and the deficiency will be approximately $\frac23 \mu^4 n$. These statistics complement the
well-known fact that the emerging giant component has almost surely grown to encompass
approximately $2\mu n$ vertices.

Sections 24 to 26 develop a theory of "stopping configurations," by which it is pos- 
sible to study the first occurrences of various events during a multigraph's evolution. In 
particular, an explicit formula is derived for the asymptotic distribution of the time when 
the excess first reaches a given value r. A closed formula is derived for the "first cycle 
constant" of [14]. 

Section 27 completes the discussion initiated in sections 17 and 18, by proving the $5\pi/18$
phenomenon as a special case of a more general result about the infinite Markov process 
in Figure 1. 

Finally, section 28 presents empirical data, showing to what extent the theory relates 
to practice when n is not too large. Section 29 discusses a number of open questions raised 
by this work. 

1. Graph evolution models. We shall consider two ways in which a random graph on 
n vertices might evolve, corresponding to sampling with and without replacement. The 
first of these, introduced implicitly in [4] and explicitly in [7, proof of Lemma 2.7] and [14],
turns out to be simpler to analyze and simpler to simulate by computer, therefore more
likely to be of importance in applications to computer science: We generate ordered pairs
$(x, y)$ repeatedly, where $1 \le x, y \le n$, and add the (undirected) edge x--y to the graph.
Each ordered pair $(x, y)$ occurs with probability $1/n^2$, so we call this the uniform model
of random graph generation. It may also be called the multigraph process, because it can
generate graphs with self-loops x--x, and it can also generate multiple edges. Notice
that a self-loop x--x is generated with probability $1/n^2$, while an edge x--y with $x \ne y$
is generated with probability $2/n^2$, because it can occur either as $(x, y)$ or $(y, x)$.

The second evolution procedure, introduced by Erdős and Rényi [12], is called the
permutation model or the graph process. In this case we consider all $N = \binom{n}{2}$ possible
edges x--y with $x < y$ and introduce them in random order, with all $N!$ permutations
considered equally likely. In this model there are no self-loops or multiple edges. 
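
The two evolution procedures can be sketched in a few lines of Python. This is only an illustrative sketch under our own assumptions: the function names `multigraph_process`, `graph_process`, and `is_simple` are hypothetical, and the pseudo-random generator is Python's standard `random.Random`.

```python
import random

def multigraph_process(n, m, rng):
    """Uniform model: m ordered pairs (x, y), each uniform on {1..n} x {1..n}.

    Self-loops (x == y) and repeated edges are allowed.
    """
    return [(rng.randint(1, n), rng.randint(1, n)) for _ in range(m)]

def graph_process(n, m, rng):
    """Permutation model: the first m edges of a uniformly random ordering
    of all N = n(n-1)/2 edges x--y with x < y."""
    all_edges = [(x, y) for x in range(1, n + 1) for y in range(x + 1, n + 1)]
    return rng.sample(all_edges, m)  # ordered sample without replacement

def is_simple(pairs):
    """True if the generated edges contain no self-loop and no repeated edge."""
    undirected = [(min(x, y), max(x, y)) for x, y in pairs]
    return all(x != y for x, y in undirected) and len(set(undirected)) == len(undirected)

if __name__ == "__main__":
    rng = random.Random(2024)
    print(multigraph_process(6, 3, rng))
    print(graph_process(6, 3, rng))
```

The permutation model always yields a simple graph, whereas the uniform model yields one only some of the time; the sketch makes that difference directly testable.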

A multigraph $M$ on $n$ labeled vertices can be defined by a symmetric $n \times n$ matrix
of nonnegative integers $m_{xy}$, where $m_{xy} = m_{yx}$ is the number of undirected edges x--y
in $M$. For purposes of analysis, we shall assign a compensation factor

$$ \kappa(M) = 1 \Big/ \prod_{x=1}^{n} \Bigl( 2^{m_{xx}} \prod_{y=x}^{n} m_{xy}! \Bigr) \tag{1.1} $$

to $M$; if $m = \sum_{x=1}^{n} \sum_{y=x}^{n} m_{xy}$ is the total number of edges, the number of sequences
$(x_1, y_1)(x_2, y_2) \ldots (x_m, y_m)$ that lead to $M$ is then exactly

$$ 2^m\, m!\, \kappa(M). \tag{1.2} $$

(The factor $2^m$ accounts for choosing either $(x, y)$ or $(y, x)$; the $2^{m_{xx}}$ in the denominator
of $\kappa(M)$ compensates for the case $x = y$. The other factor $m!$ accounts for permutations
of the pairs, with $m_{xy}!$ in $\kappa(M)$ to compensate for permutations between multiple edges.)

Equation (1.2) tells us that $\kappa(M)$ is a natural weighting factor for a multigraph $M$,
because it corresponds to the relative frequency with which $M$ tends to occur in applications.
For example, consider multigraphs on three vertices $\{1, 2, 3\}$ having exactly three
edges. The edges will form the cycle $M_1$ = {1--2, 2--3, 3--1} much more often
than they will form three identical self-loops $M_2$ = {1--1, 1--1, 1--1}, when the
multigraphs are generated in a uniform way. For if we consider the $3^6$ possible sequences
$(x_1, y_1)(x_2, y_2)(x_3, y_3)$ with $1 \le x, y \le 3$, only one of these generates the latter multigraph,
while the cyclic multigraph is obtained in $2^3 \cdot 3! = 48$ ways. Therefore it makes sense to
assign weights so that $\kappa(M_2) = \frac{1}{48}\kappa(M_1)$, and indeed (1.1) gives $\kappa(M_1) = 1$, $\kappa(M_2) = \frac{1}{48}$.

Notice that a given multigraph $M$ is a graph (i.e., it has no loops and no multiple
edges) if and only if $\kappa(M) = 1$. Notice also that if $M$ consists of several disjoint
components $M_1, \ldots, M_k$, with no edges between vertices of $M_i$ and $M_j$ for $i \ne j$, we have

$$ \kappa(M) = \kappa(M_1) \ldots \kappa(M_k). \tag{1.3} $$
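
The three-vertex example can be checked by brute force. The following sketch is ours, not part of the original analysis; the helper names `canonical`, `kappa`, and `sequences_leading_to` are hypothetical. It enumerates all $3^6 = 729$ sequences of three ordered pairs and compares the counts with $2^m\, m!\, \kappa(M)$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial

def canonical(pairs):
    """Multiset of undirected edges (stored with x <= y) generated by ordered pairs."""
    return Counter((min(x, y), max(x, y)) for x, y in pairs)

def kappa(multigraph):
    """Compensation factor (1.1); multigraph is a Counter {(x, y): multiplicity}, x <= y."""
    denom = 1
    for (x, y), mult in multigraph.items():
        denom *= factorial(mult)      # permutations among parallel edges
        if x == y:
            denom *= 2 ** mult        # a self-loop has only one orientation
    return Fraction(1, denom)

def sequences_leading_to(multigraph, n):
    """Brute-force count of ordered-pair sequences on {1..n} that generate multigraph."""
    m = sum(multigraph.values())
    vertex_pairs = list(product(range(1, n + 1), repeat=2))
    return sum(canonical(seq) == multigraph for seq in product(vertex_pairs, repeat=m))

if __name__ == "__main__":
    cycle = Counter({(1, 2): 1, (2, 3): 1, (1, 3): 1})   # M_1
    loops = Counter({(1, 1): 3})                         # M_2
    for M in (cycle, loops):
        print(kappa(M), sequences_leading_to(M, 3), 2 ** 3 * factorial(3) * kappa(M))
```

Exhaustive enumeration is feasible only for tiny cases like this one, but it confirms both factors in (1.2) independently of the formula.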

2. Generating functions. We shall use bivariate generating functions (bgf's) to study
labeled graphs and multigraphs and their connected components. If $\mathcal F$ is a family of
multigraphs with labeled vertices, the associated bgf is the formal power series

$$ F(w, z) = \sum_{M \in \mathcal F} \kappa(M)\, w^{m(M)}\, \frac{z^{n(M)}}{n(M)!}, \tag{2.1} $$

where $m(M)$ and $n(M)$ denote the number of edges and the number of vertices of $M$. We
can do many operations on such power series without regard to convergence. It follows
from (1.2) and (2.1) that $m$ steps of the uniform evolution model on $n$ vertices will produce
a multigraph in $\mathcal F$ with probability

$$ \frac{2^m\, m!\, n!}{n^{2m}}\, [w^m z^n]\, F(w, z), \tag{2.2} $$

where the symbol $[w^m z^n]$ denotes the coefficient of $w^m z^n$ in the formal power series that
follows it. Similarly, if $\hat{\mathcal F}$ is a family of graphs with labeled vertices, the probability that
$m$ steps of the permutation model will produce a graph in $\hat{\mathcal F}$ is

$$ \frac{n!}{\binom{N}{m}}\, [w^m z^n]\, \hat F(w, z), \qquad N = \binom{n}{2}. \tag{2.3} $$
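Formulas (2.2) and (2.3) are asymptotically related by the formula

$$ \binom{N}{m} = \frac{n^{2m}}{2^m\, m!} \exp\Bigl( -\frac{m}{n} - \frac{m^2}{n^2} + O\Bigl(\frac{m}{n^2}\Bigr) + O\Bigl(\frac{m^3}{n^4}\Bigr) \Bigr), \qquad 0 \le m \le N, \tag{2.4} $$

which follows from Stirling's approximation.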

Incidentally, the exponential factor in (2.4) is the probability that m steps of the 
multigraph process will produce no self-loops or multiple edges. When $m = \frac12 n$, this
probability is $e^{-3/4} + O(n^{-1}) \approx 0.472$.
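
The value $e^{-3/4}$ can be compared with the exact probability for a moderately large $n$. The sketch below is illustrative only; the name `prob_simple` is hypothetical. It uses the fact that exactly $\binom{N}{m}\, 2^m\, m!$ of the $n^{2m}$ equally likely sequences of ordered pairs produce a simple graph, and it evaluates that ratio as a product so that no large factorials are needed.

```python
from math import exp

def prob_simple(n, m):
    """Exact probability that m steps of the uniform model on n vertices
    produce no self-loop and no multiple edge.

    Equals C(N, m) * 2^m * m! / n^(2m) with N = n(n-1)/2, written as the
    product over j of 2 * (N - j) / n^2.
    """
    N = n * (n - 1) // 2
    if m > N:
        return 0.0
    p = 1.0
    for j in range(m):
        p *= 2 * (N - j) / (n * n)
    return p

if __name__ == "__main__":
    n = 2000
    print(prob_simple(n, n // 2), exp(-0.75))
```

For $n = 3$, $m = 3$ the product reduces to $48/729$, in agreement with the count of sequences that produce the 3-cycle.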

When we say that the n vertices of a multigraph are "labeled," it is often convenient 
to think of the labeling as an assignment of the numbers 1 to n. But a strict numeric 
convention would require us to recompute the labels whenever vertices are removed or 
when multigraphs are combined. The actual value of a label is, in fact, irrelevant; what 
really counts is the relative order between labels. Labeled multigraphs are multigraphs 
whose vertices have been totally ordered. In this paper all graphs and multigraphs are 
assumed to be labeled, i.e., totally ordered, even when the adjective "labeled" is not 
stated. 

The bgf (2.1) is an exponential generating function in $z$, and the factor $\kappa(M)$ is
multiplicative according to (1.3). Therefore the product of bgf's

$$ F_1(w, z)\, F_2(w, z) \ldots F_k(w, z) $$

represents ordered $k$-tuples of labeled multigraphs $(M_1, M_2, \ldots, M_k)$, each $M_j$ being from
family $\mathcal F_j$. Unordered $k$-tuples $\{M_1, \ldots, M_k\}$ from a common family $\mathcal F$ have the bgf
$F(w, z)^k/k!$, if $\mathcal F$ does not include the empty multigraph. For example, the bgf for a
3-cycle is $w^3 z^3/3!$, and the bgf for two isolated vertices is $z^2/2!$; hence the bgf for a 3-cycle
and two isolated vertices is $(w^3 z^3/6)(z^2/2) = 10\, w^3 z^5/5!$. (There are 10 such graphs, one
for each choice of the isolated points.)

Let $C(w, z)$ be the bgf for all connected multigraphs, and let $G(w, z)$ be the bgf for
the set of all multigraphs. Then we have

$$ e^{C(w,z)} = \sum_{k \ge 0} \frac{C(w,z)^k}{k!} = G(w, z), \tag{2.5} $$

because the term $C(w, z)^k/k!$ is the bgf for multigraphs having exactly $k$ components.
Similarly, if $\hat C(w, z)$ and $\hat G(w, z)$ are the corresponding bgf's for graphs instead of multigraphs,
we have

$$ e^{\hat C(w,z)} = \hat G(w, z), \tag{2.6} $$

a well-known formula due to Riddell [32]. The bgf for all graphs is obviously

$$ \hat G(w, z) = \sum_{n \ge 0} (1+w)^{n(n-1)/2}\, \frac{z^n}{n!}. \tag{2.7} $$

Therefore (2.6) gives us the bgf for connected graphs,

$$ \begin{aligned}
\hat C(w, z) &= \ln\Bigl( 1 + z + (1+w)\frac{z^2}{2} + (1+w)^3 \frac{z^3}{6} + \cdots \Bigr) \\
             &= z + w\,\frac{z^2}{2} + (3w^2 + w^3)\frac{z^3}{6} + \cdots.
\end{aligned} \tag{2.8} $$

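The bgf $G(w, z)$ for all multigraphs can be found as follows: The coefficient of $z^n/n!$
is $\sum \kappa(M)\, w^{m(M)}$, summed over multigraphs $M$ on $n$ vertices. This is

$$ \prod_{x=1}^{n} \Bigl( \sum_{m_{xx} \ge 0} \frac{w^{m_{xx}}}{2^{m_{xx}}\, m_{xx}!} \prod_{y=x+1}^{n} \sum_{m_{xy} \ge 0} \frac{w^{m_{xy}}}{m_{xy}!} \Bigr) = \prod_{x=1}^{n} e^{w/2} (e^{w})^{n-x} = e^{w n^2/2}. $$

Hence the desired formula is slightly simpler than (2.7):

$$ G(w, z) = \sum_{n \ge 0} e^{w n^2/2}\, \frac{z^n}{n!}. \tag{2.9} $$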

The corresponding bgf for connected multigraphs is therefore

$$ \begin{aligned}
C(w, z) &= \ln G(w, z) \\
        &= \bigl(1 + \tfrac12 w + \tfrac18 w^2 + \tfrac{1}{48} w^3 + \cdots\bigr) z
         + \bigl(w + \tfrac32 w^2 + \tfrac76 w^3 + \cdots\bigr) \frac{z^2}{2} \\
        &\quad + \bigl(3w^2 + \tfrac{17}{2} w^3 + \cdots\bigr) \frac{z^3}{6} + \cdots.
\end{aligned} \tag{2.10} $$

In this case the coefficient of $w^3 z^3$ is $\frac{17}{12}$ because the connected multigraphs with three
edges on three vertices have total weight $\frac{17}{2}$. (The 3-cycle has weight 1; there are 9
multigraphs obtainable by adding a self-loop to a tree, each of weight $\frac12$; and there are six
multigraphs obtainable by doubling one edge of a tree, again weighted by $\frac12$.)

Notice that expression (2.2) is $[w^m z^n]\, F(w, z) \,/\, [w^m z^n]\, G(w, z)$, the ratio of the weight
of multigraphs in $\mathcal F$ to the weight of all possible multigraphs. Similarly, expression (2.3)
is $[w^m z^n]\, \hat F(w, z) \,/\, [w^m z^n]\, \hat G(w, z)$.
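
The leading coefficients of $C(w, z) = \ln G(w, z)$ can be reproduced with exact rational arithmetic. The following is a sketch under our own conventions (the function name `connected_multigraph_bgf` is hypothetical): it takes $[w^m z^n]\,G = (n^2/2)^m/(m!\,n!)$ and extracts the logarithm through the standard relation $z G_z = G \cdot z C_z$, truncating every polynomial in $w$ at a chosen degree.

```python
from fractions import Fraction
from math import factorial

def connected_multigraph_bgf(max_edges, max_vertices):
    """Return c with c[n][m] = [w^m z^n] C(w, z), where C = ln G."""
    M, N = max_edges, max_vertices
    # g[n][m] = [w^m z^n] G(w, z) = (n^2/2)^m / (m! n!)
    g = [[Fraction(n * n, 2) ** m / (factorial(m) * factorial(n))
          for m in range(M + 1)] for n in range(N + 1)]

    def mul(p, q):
        """Product of two polynomials in w, truncated at degree M."""
        r = [Fraction(0)] * (M + 1)
        for i, pi in enumerate(p):
            for j, qj in enumerate(q[:M + 1 - i]):
                r[i + j] += pi * qj
        return r

    c = [[Fraction(0)] * (M + 1) for _ in range(N + 1)]
    for n in range(1, N + 1):
        # n g_n = sum_{k=1..n} k c_k g_{n-k}, and g_0 = 1, so solve for c_n.
        acc = [n * x for x in g[n]]
        for k in range(1, n):
            term = mul(c[k], g[n - k])
            acc = [x - k * t for x, t in zip(acc, term)]
        c[n] = [x / n for x in acc]
    return c

if __name__ == "__main__":
    c = connected_multigraph_bgf(3, 3)
    for n in (1, 2, 3):
        print(n, [str(x) for x in c[n]])
```

Multiplying `c[n][m]` by $n!$ gives the total weight of connected multigraphs with $m$ edges on $n$ labeled vertices, which is how the value $\frac{17}{2}$ arises.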

It is convenient to group the terms of (2.8) and (2.10) according to the excess of edges
over vertices in connected components. Let $\mathcal C_r$ and $\hat{\mathcal C}_r$ denote the families of connected
multigraphs and graphs in which there are exactly $r$ more edges than vertices; let $C_r(w, z)$
and $\hat C_r(w, z)$ be the corresponding bgf's. Then we have

$$ \begin{aligned}
C(w, z) &= \sum_r C_r(w, z) = \sum_r w^r C_r(wz), \\
\hat C(w, z) &= \sum_r \hat C_r(w, z) = \sum_r w^r \hat C_r(wz),
\end{aligned} \tag{2.11} $$

where $C_r(z)$ and $\hat C_r(z)$ are univariate generating functions for $\mathcal C_r$ and $\hat{\mathcal C}_r$. A univariate
generating function $F(z)$ is $\sum \kappa(M)\, z^{n(M)}/n(M)!$, summed over all graphs or multigraphs in a
given family $\mathcal F$. We obtain it from a bgf by setting $w = 1$, thereby ignoring the number of
edges. Univariate generating functions are easier to deal with than bgf's, so we generally 
try to avoid the need for two independent variables whenever possible. 

3. Trees, unicycles, and bicycles. Let us say that a connected component has excess $r$
if it belongs to $\mathcal C_r$, i.e., if it has $r$ more edges than vertices. A connected graph on $n$ vertices
must have at least $n - 1$ edges. Hence $\mathcal C_r = \hat{\mathcal C}_r = \emptyset$ unless $r \ge -1$. In the extreme case $r = -1$,
we have $\mathcal C_{-1} = \hat{\mathcal C}_{-1}$ = the family of all unrooted trees, which are acyclic components. In the
next case $r = 0$, the generating functions $C_0$ and $\hat C_0$ represent unicyclic components, which
are trees with an additional edge. Similarly, $C_1$ and $\hat C_1$ represent bicyclic components. In
the present paper we shall deal extensively with sparse components of these three kinds,
so it will be convenient to use the special abbreviations

$U(z) = C_{-1}(z) = \hat C_{-1}(z)$ for unrooted trees;

$V(z) = C_0(z)$ and $\hat V(z) = \hat C_0(z)$ for unicyclic components;

$W(z) = C_1(z)$ and $\hat W(z) = \hat C_1(z)$ for bicyclic components.
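
According to a well-known theorem of Sylvester [37] and Borchardt [8], often attributed
erroneously to Cayley [10] although Cayley himself credited Borchardt, we have
$U(z) = \sum_{n \ge 1} n^{n-2} z^n/n!$. The other four generating functions begin as follows:

$$ \begin{aligned}
V(z) &= \tfrac12 z + \tfrac34 z^2 + \tfrac{17}{12} z^3 + \tfrac{71}{24} z^4 + \cdots; \\
\hat V(z) &= \tfrac16 z^3 + \tfrac58 z^4 + \tfrac{37}{20} z^5 + \cdots; \\
W(z) &= \tfrac18 z + \tfrac{7}{12} z^2 + \tfrac{101}{48} z^3 + \cdots; \\
\hat W(z) &= \tfrac14 z^4 + \tfrac{41}{24} z^5 + \cdots.
\end{aligned} $$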

All of these generating functions can be expressed succinctly in terms of the tree
function

$$ T(z) = \sum_{n \ge 1} n^{n-1} \frac{z^n}{n!} = z + z^2 + \tfrac32 z^3 + \cdots, \tag{3.1} $$

which generates rooted labeled trees and satisfies the functional relation

$$ T(z) = z\, e^{T(z)} \tag{3.2} $$

due to Eisenstein [11]. Indeed, the relation

$$ U(z) = T(z) - \tfrac12 T(z)^2 \tag{3.3} $$

is well known, as are the formulas

$$ V(z) = \tfrac12 \ln \frac{1}{1 - T(z)}, \tag{3.4} $$

$$ \hat V(z) = \tfrac12 \ln \frac{1}{1 - T(z)} - \tfrac12 T(z) - \tfrac14 T(z)^2; \tag{3.5} $$

see [14]. We can prove (3.4) and (3.5) by noting that the univariate generating function
for connected unicyclic multigraphs whose cycle has length $k$ is

$$ \frac{T(z)^k}{2k}; $$

summing over $k \ge 1$ gives (3.4), and summing over $k \ge 3$ gives (3.5). (If $k = 1$, the
cycle is a self-loop; hence the multigraph is essentially a rooted tree and the compensation
factor is $\frac12$. If $k = 2$, the cycle is a duplicate edge; hence the multigraph is essentially
an unordered pair of rooted trees, and the compensation factor again is $\frac12$. If $k \ge 3$, the
unicyclic component is essentially a sequence of $k$ rooted trees, divided by $2k$ to account
for cyclic order and change of orientation.)

The generating function $\hat W(z)$ was shown by G. N. Bagaev [1] to be

$$ \hat W(z) = \frac{T(z)^4 \bigl(6 - T(z)\bigr)}{24 \bigl(1 - T(z)\bigr)^3}. \tag{3.6} $$

Then E. M. Wright made a careful study of all the generating functions $\hat C_k(z)$, which he
called $W_k$, in a series of significant papers [41, 43, 44, 45]. We will show below that the
generating function for bicyclic connected multigraphs is

$$ W(z) = \frac{T(z) \bigl(3 + 2T(z)\bigr)}{24 \bigl(1 - T(z)\bigr)^3}. \tag{3.7} $$

The coefficients of powers of $1/(1 - T(z))$ arise in numerous applications, so Knuth and
Pittel [24] began to catalog some of their interesting properties. For each $n$ the function
$t_n(y)$ defined by

$$ \frac{1}{\bigl(1 - T(z)\bigr)^{y}} = \sum_{n \ge 0} t_n(y)\, \frac{z^n}{n!} \tag{3.8} $$

is a polynomial of degree $n$ in $y$, called the tree polynomial of order $n$. The coefficient
of $y^k$ in $t_n(y)$ is the number of mappings from an $n$-element set into itself having exactly
$k$ cycles. For fixed $y$ and $n \to \infty$, we have [24, Lemma 2 and (3.16)]

$$ t_n(y) = \frac{\sqrt{2\pi}\; n^{n - 1/2 + y/2}}{2^{y/2}\, \Gamma(y/2)} + O\bigl(n^{n - 1 + y/2}\bigr). \tag{3.9} $$
We can, for example, express the number of connected bicyclic graphs on $n$ vertices in
terms of the tree polynomial $t_n$, namely

$$ \tfrac{5}{24}\, t_n(3) - \tfrac{19}{24}\, t_n(2) + \tfrac{13}{12}\, t_n(1) - \tfrac{7}{12}\, t_n(0) + \tfrac{1}{24}\, t_n(-1) + \tfrac{1}{24}\, t_n(-2), \tag{3.10} $$

because (3.6) can be rewritten

$$ \hat W(z) = \frac{5}{24 (1 - T(z))^3} - \frac{19}{24 (1 - T(z))^2} + \frac{13}{12 (1 - T(z))} - \frac{7}{12} + \frac{1 - T(z)}{24} + \frac{(1 - T(z))^2}{24}. $$

Equation (3.9) tells us that only the first term $\frac{5}{24} t_n(3)$ of (3.10) is asymptotically significant.
Extensions of (3.9) appear in equations (19.13) and (19.14) below.
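
Formula (3.10) can be compared with a direct enumeration for small $n$. The sketch below is only illustrative, and all function names in it are hypothetical. It computes $t_n(y)$ for integer $y$ from the truncated series of $T(z)$ and counts connected graphs with $n + 1$ edges by exhaustive search with a union-find structure.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def mul(a, b, nmax):
    """Product of two power series, truncated after z^nmax."""
    c = [Fraction(0)] * (nmax + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b[:nmax + 1 - i]):
            c[i + j] += ai * bj
    return c

def inverse(a, nmax):
    """Reciprocal of a power series whose constant term is nonzero."""
    inv = [Fraction(1) / a[0]] + [Fraction(0)] * nmax
    for n in range(1, nmax + 1):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1)) / a[0]
    return inv

def tree_polynomial(y, n):
    """t_n(y) = n! [z^n] (1 - T(z))^(-y) for an integer y, as in (3.8)."""
    one_minus_T = [Fraction(1)] + [-Fraction(k ** (k - 1), factorial(k))
                                   for k in range(1, n + 1)]
    base = inverse(one_minus_T, n) if y >= 0 else one_minus_T
    p = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(abs(y)):
        p = mul(p, base, n)
    return p[n] * factorial(n)

def bicyclic_by_formula(n):
    """Value of expression (3.10)."""
    weights = {3: Fraction(5, 24), 2: Fraction(-19, 24), 1: Fraction(13, 12),
               0: Fraction(-7, 12), -1: Fraction(1, 24), -2: Fraction(1, 24)}
    return sum(w * tree_polynomial(y, n) for y, w in weights.items())

def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def bicyclic_by_enumeration(n):
    """Number of connected labeled graphs with n vertices and n + 1 edges."""
    edges = list(combinations(range(n), 2))
    count = 0
    for chosen in combinations(edges, n + 1):
        parent = list(range(n))
        components = n
        for u, v in chosen:
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
        count += components == 1
    return count

if __name__ == "__main__":
    for n in (4, 5, 6):
        print(n, bicyclic_by_formula(n), bicyclic_by_enumeration(n))
```

The enumeration grows like $\binom{n(n-1)/2}{n+1}$, so it is practical only for very small $n$; the series computation has no such limitation.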

We can also express quantities like (3.10) in terms of Ramanujan's function [30]

$$ Q(n) = 1 + \frac{n-1}{n} + \frac{n-1}{n}\,\frac{n-2}{n} + \frac{n-1}{n}\,\frac{n-2}{n}\,\frac{n-3}{n} + \cdots, $$

which Wright [41] called $1 + h(n)/n^n$. For we have

$$ t_n(1) = n^n; \qquad t_n(2) = n^n \bigl(1 + Q(n)\bigr); \qquad t_n(y+2) = \frac{n\, t_n(y)}{y} + t_n(y+1), \quad y \ne 0. \tag{3.12} $$

(See [24, equations (2.7), (3.14), and (1.9)].) Furthermore, we have

$$ n!\, [z^n]\, V(z) = \tfrac12 n^{n-1} Q(n); \tag{3.13} $$

this follows from a well-known formula of Rényi [31].
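
The identities involving $Q(n)$ lend themselves to an exact check for small $n$. In the sketch below (our own illustration; the names `ramanujan_q`, `t2`, and `unicyclic_weight` are hypothetical), $t_n(2)$ is obtained by squaring the series $\sum n^n z^n/n!$, which is $1/(1 - T(z))$ according to $t_n(1) = n^n$, and the coefficients of $V(z)$ are obtained by summing $T(z)^k/(2k)$ as in the proof of (3.4).

```python
from fractions import Fraction
from math import comb, factorial

def ramanujan_q(n):
    """Q(n) = 1 + (n-1)/n + (n-1)(n-2)/n^2 + ..., a finite sum of n terms."""
    total, term = Fraction(0), Fraction(1)
    for k in range(1, n + 1):
        total += term
        term *= Fraction(n - k, n)
    return total

def t2(n):
    """t_n(2) = n! [z^n] (1/(1 - T))^2, a binomial convolution of k^k with itself."""
    return sum(comb(n, k) * k ** k * (n - k) ** (n - k) for k in range(n + 1))

def unicyclic_weight(n):
    """n! [z^n] V(z), with V(z) = sum over k >= 1 of T(z)^k / (2k)."""
    T = [Fraction(0)] + [Fraction(m ** (m - 1), factorial(m)) for m in range(1, n + 1)]
    power = [Fraction(1)] + [Fraction(0)] * n      # coefficients of T^0
    coeff = Fraction(0)
    for k in range(1, n + 1):                      # T^k contributes nothing when k > n
        power = [sum(power[i] * T[j - i] for i in range(j + 1)) for j in range(n + 1)]
        coeff += power[n] / (2 * k)
    return coeff * factorial(n)

if __name__ == "__main__":
    for n in range(1, 7):
        print(n, ramanujan_q(n), t2(n), unicyclic_weight(n))
```

Here `unicyclic_weight(3)` is the total weight of connected unicyclic multigraphs on three vertices, the same quantity that appears as the coefficient of $w^3 z^3/3!$ in $\ln G(w, z)$.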

4. The cyclic components. For theoretical purposes it proves to be important to 
partition a multigraph into its acyclic part, consisting entirely of isolated vertices or trees, 
and its cyclic part, consisting entirely of components that each contain at least one cycle. 
The cyclic part can in turn be partitioned into the unicyclic part, consisting entirely of 
unicyclic components, and the complex part, consisting entirely of components that have 
more edges than vertices. A multigraph is called cyclic if it equals its cyclic part, complex 
if it equals its complex part. In this section and the next, we will study the generating 
functions for cyclic and complex multigraphs. The formulas turn out to be surprisingly 
simple, and they will be the key to much of what follows. 

Let $F(w, z)$ be the bgf for all cyclic multigraphs, i.e., for all multigraphs whose acyclic
part is empty. Formulas (2.5) and (2.11) tell us that

$$ F(w, z) = e^{C_0(w,z) + C_1(w,z) + \cdots} = e^{C(w,z) - C_{-1}(w,z)} = G(w, z)\, e^{-U(wz)/w}; $$

in other words,

$$ G(w, z) = e^{U(wz)/w}\, F(w, z). \tag{4.1} $$

Indeed, this makes sense, because $e^{U(wz)/w}$ is the bgf for all acyclic multigraphs. We will
analyze $F = F(w, z)$ by studying a linear differential equation satisfied by $G = G(w, z)$,
and seeing that a similar equation is satisfied by $F$.

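Let $\vartheta_w$ be the differential operator $w \frac{\partial}{\partial w}$, and let $\vartheta_z$ be $z \frac{\partial}{\partial z}$. The operator $\vartheta_w$
corresponds to marking an edge of a multigraph, i.e., giving some edge a special label, because
$\vartheta_w$ multiplies the coefficient of $w^m z^n$ by $m$. Similarly, $\vartheta_z$ corresponds to marking a vertex,
because it multiplies the coefficient of $w^m z^n$ by $n$. (For a general discussion of marking,
see [16, sections 2.2.24 and following].) We have

$$ \frac{1}{w}\, \vartheta_w G = \sum_{n \ge 0} \frac{n^2}{2}\, e^{w n^2/2}\, \frac{z^n}{n!} = \frac12\, \vartheta_z^2 G; $$

hence $G$ satisfies the differential equation

$$ \frac{2}{w}\, \vartheta_w G = \vartheta_z^2 G. \tag{4.2} $$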

Again, this makes sense: The left side represents all multigraphs having a marked edge and
an orientation assigned to that edge, and with the edge count decreased by 1. The right
side represents all multigraphs with an ordered pair $(x, y)$ of marked vertices. Orienting
and discounting an edge is the same as marking two vertices.

We can also write (4.2) in the suggestive form

$$ G(w, z) = e^z + \frac12 \int_0^w \vartheta_z^2\, G(w, z)\, dw, \tag{4.3} $$

using the boundary condition $G(0, z) = e^z$. (The generating function for all multigraphs
with no edges is, of course, $e^z$.) The operator $\vartheta_z^2$ corresponds to choosing an ordered
pair $(x, y)$, and the operator $\frac12 \int_0^w$ corresponds to disorienting that edge and blending it into
the existing multigraph. (Notice that the English words "differentiation" and "integration"
are remarkably apt synonyms for the combinatorial operations of marking and blending.)
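
The marking interpretation can be checked coefficient by coefficient, using $[w^m z^n]\, G = (n^2/2)^m/(m!\, n!)$ from (2.9). The sketch below is ours, with hypothetical function names; it compares "mark and orient one of $m$ edges, then discount it" with "mark an ordered pair of vertices," which is the identity $\frac{2}{w}\vartheta_w G = \vartheta_z^2 G$ read off at $w^{m-1} z^n$.

```python
from fractions import Fraction
from math import factorial

def g(m, n):
    """[w^m z^n] G(w, z) = (n^2/2)^m / (m! n!), from (2.9)."""
    return Fraction(n * n, 2) ** m / (factorial(m) * factorial(n))

def oriented_marked_edge(m, n):
    """[w^(m-1) z^n] of (2/w) theta_w G: factor m for the mark, 2 for the orientation."""
    return 2 * m * g(m, n)

def marked_vertex_pair(m, n):
    """[w^(m-1) z^n] of theta_z^2 G: factor n^2 for an ordered pair of vertices."""
    return n * n * g(m - 1, n)

if __name__ == "__main__":
    for m in range(1, 4):
        for n in range(0, 4):
            print(m, n, oriented_marked_edge(m, n), marked_vertex_pair(m, n))
```

The same coefficients also recover the total number of sequences: $2^m\, m!\, n!\, [w^m z^n]\, G = n^{2m}$.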

Most of our work will involve $\vartheta_z$ instead of $\vartheta_w$, so we shall often write simply $\vartheta$
without a subscript when we mean $\vartheta_z$. The marking operator $\vartheta$ has a simple effect on the
generating functions $U(z)$ for unrooted trees and $T(z)$ for rooted trees. Indeed, we have

$$ \vartheta U(z) = T(z), \tag{4.4} $$

because an unrooted tree with a marked vertex is the same as a rooted tree. Furthermore

$$ \vartheta T(z) = \sum_{k \ge 1} T(z)^k = \frac{T(z)}{1 - T(z)}, \tag{4.5} $$

because a rooted tree with a marked vertex is combinatorially equivalent to an ordered
sequence $(T_1, T_2, \ldots, T_k)$ of rooted trees, for some $k \ge 1$. The sequence represents a path
of length $k$ from the marked vertex to the root, with rooted subtrees sprouting from each
point on that path.

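Now let $U = U(wz)/w$ be the function $C_{-1}(w, z)$ that appears in (4.1), and let
$T = T(wz)$. We have

$$ \vartheta_z U = z \frac{\partial}{\partial z} \frac{U(wz)}{w} = z\, \frac{w\, U'(wz)}{w} = z\, \frac{T(wz)}{wz} = \frac{T}{w}; $$

$$ \vartheta_w U = w \frac{\partial}{\partial w} \frac{U(wz)}{w} = w \Bigl( \frac{z\, U'(wz)}{w} - \frac{U(wz)}{w^2} \Bigr) = \frac{T - U(wz)}{w} = \frac{T^2}{2w}. $$

Thus

$$ \frac{2}{w}\, \vartheta_w U = (\vartheta_z U)^2. \tag{4.6} $$

In words: "Orienting and discounting an edge of an unrooted tree is equivalent to
constructing an ordered pair of rooted trees."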
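
We are now ready to convert (4.2) into a differential equation satisfied by $F = F(w, z)$:

$$ \begin{aligned}
\frac{2}{w}\, \vartheta_w G &= e^{U} \Bigl( \frac{2}{w} (\vartheta_w U) F + \frac{2}{w}\, \vartheta_w F \Bigr), \\
\vartheta_z^2 G &= e^{U} \bigl( (\vartheta_z^2 U) F + (\vartheta_z U)^2 F + 2 (\vartheta_z U)(\vartheta_z F) + \vartheta_z^2 F \bigr).
\end{aligned} $$

Therefore, using (4.6), we have

$$ \frac{2}{w}\, \vartheta_w F = (\vartheta_z^2 U) F + 2 (\vartheta_z U)(\vartheta_z F) + \vartheta_z^2 F. \tag{4.7} $$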

And like our other formulas, this one makes combinatorial sense as well as algebraic sense: 
The left side tells us that the right side should yield all ways that the cyclic part of a 
multigraph can grow, since $\frac{2}{w}\vartheta_w F$ is the number of ways it can go backward one step.
The first term on the right corresponds to marking two vertices of an unrooted tree (in the 
acyclic part of the multigraph); joining them will produce a unicyclic component, thereby 
increasing the number of components in F. The middle term corresponds to marking a 
vertex in some tree of the acyclic part and another vertex in the cyclic part; joining them 
will add new vertices to one of F's existing components. The remaining term corresponds 
to marking two vertices in the cyclic part. If such marked vertices belong to the same 
component, say a component of excess r, a new edge between them will change the excess 
of the component to r + 1. Otherwise, the marked vertices belong to different components, 
having respective excesses $r$ and $s$, possibly with $r = s$; joining them will merge the
components into a new component of excess r + s + 1. 

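Similarly, we can proceed to study the bgf $E(w, z)$ for the complex part of a multigraph,
the part whose components all have positive excess. (The letter $E$ stands for excess.)
We have

$$ F(w, z) = e^{V(wz)}\, E(w, z), \tag{4.8} $$

where $V = V(wz)$ generates unicyclic components. It is easy to verify the identity

$$ \frac{2}{w}\, \vartheta_w V = \vartheta_z^2 U + 2 (\vartheta_z U)(\vartheta_z V), \tag{4.9} $$

which corresponds to a combinatorially evident fact. Indeed,

$$ \vartheta_w V = \vartheta_z V = \frac{T}{2 (1 - T)^2}; \qquad \vartheta_z U = \frac{T}{w}; \qquad \vartheta_z^2 U = \frac{T}{w (1 - T)}. \tag{4.10} $$

Therefore we find

$$ \frac{2}{w}\, \vartheta_w E = (\vartheta_z^2 V) E + (\vartheta_z V)^2 E + 2 (\vartheta_z U)(\vartheta_z E) + 2 (\vartheta_z V)(\vartheta_z E) + \vartheta_z^2 E. \tag{4.11} $$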

5. Enumerating complex multigraphs. To solve the differential equation (4.11), we
can first write it in the form

$$ \frac{1}{w} (\vartheta_w - T \vartheta_z) E = \frac12\, e^{-V} \vartheta_z^2\, e^{V} E. \tag{5.1} $$

Now we partition $E = E(w, z)$ into terms of equal excess, as we did for $C(w, z)$ in (2.11):

$$ E(w, z) = \sum_r E_r(w, z) = \sum_r w^r E_r(wz). \tag{5.2} $$

The univariate generating function $E_r(z)$ represents all complex multigraphs having exactly
$r$ more edges than vertices; in particular, $E_0(z) = 1$, since only the empty multigraph is
"complex" and has excess 0. Differentiation yields

$$ \begin{aligned}
\vartheta_w E &= \sum_r \bigl( r w^r E_r(wz) + w^r (\vartheta E_r)(wz) \bigr), \\
\vartheta_z E &= \sum_r w^r (\vartheta E_r)(wz),
\end{aligned} $$

where $(\vartheta E_r)(wz)$ here means $\vartheta_z E_r(z)$ with the argument $z$ subsequently replaced by $wz$,
namely $wz\, E_r'(wz)$. Therefore, if we equate the coefficients of $w^{r-1}$ on both sides of (5.1)
and set $w = 1$, we obtain a differential recurrence for the univariate generating functions
$E_r = E_r(z)$:

$$ (r + \vartheta - T \vartheta) E_r = \frac12\, e^{-V} \vartheta^2 e^{V} E_{r-1}. \tag{5.3} $$
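It is convenient to introduce a new variable

$$ \zeta = \frac{T(z)}{1 - T(z)} \tag{5.4} $$

and to express $E_r$ in terms of $\zeta$ instead of $z$. Note that

$$ 1 + \zeta = \frac{1}{1 - T(z)}, \qquad T(z) = \frac{\zeta}{1 + \zeta}. \tag{5.5} $$

Equation (5.3) now takes the form

$$ \bigl( r + (1+\zeta)^{-1} \vartheta \bigr) E_r = \frac12 (1+\zeta)^{-1/2}\, \vartheta^2\, (1+\zeta)^{1/2} E_{r-1}, \tag{5.6} $$

since $e^{V} = 1/(1 - T(z))^{1/2} = (1+\zeta)^{1/2}$ by (3.4). We will see later that the variable $\zeta$, which
represents an ordered sequence of one or more rooted trees, has important significance in
the study of graphs and multigraphs.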
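
In the $\zeta$ world, with $\vartheta$ still denoting $z \frac{d}{dz}$, we have the operator equation

$$ \vartheta \cdot f(\zeta) = f'(\zeta)\, \zeta (1+\zeta)^2 + f(\zeta)\, \vartheta, \tag{5.7} $$

because

$$ \vartheta \zeta = z \frac{d\zeta}{dz} = \frac{T(z)}{1 - T(z)} \cdot \frac{1}{(1 - T(z))^2} = \zeta (1+\zeta)^2. $$

Equation (5.7) allows us to commute $\vartheta$ with functions of $\zeta$. For example, we find

$$ (1+\zeta)^{-1/2}\, \vartheta\, (1+\zeta)^{1/2} = (1+\zeta)^{-1/2} \bigl( \tfrac12 (1+\zeta)^{-1/2}\, \zeta (1+\zeta)^2 + (1+\zeta)^{1/2}\, \vartheta \bigr) = \tfrac12 \zeta (1+\zeta) + \vartheta; $$

hence (5.6) can be rewritten

$$ \bigl( r + (1+\zeta)^{-1} \vartheta \bigr) E_r = \frac12 \bigl( \tfrac12 \zeta (1+\zeta) + \vartheta \bigr)^2 E_{r-1}. \tag{5.8} $$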
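To simplify the equation even further, we seek a function $f_r(\zeta)$ such that

$$ \vartheta \cdot f_r(\zeta) = (1+\zeta)\, f_r(\zeta) \bigl( r + (1+\zeta)^{-1} \vartheta \bigr); $$

then the differential equation (5.8) will become

$$ \vartheta \bigl( f_r(\zeta) E_r \bigr) = \tfrac12 (1+\zeta)\, f_r(\zeta) \bigl( \tfrac12 \zeta (1+\zeta) + \vartheta \bigr)^2 E_{r-1}, $$

which can be solved by integration. According to (5.7), the desired factor $f_r(\zeta)$ is a solution
to

$$ \frac{f_r'(\zeta)}{f_r(\zeta)} = \frac{r}{\zeta (1+\zeta)} = \frac{r}{\zeta} - \frac{r}{1+\zeta}, $$

so we let $f_r(\zeta) = \zeta^r (1+\zeta)^{-r}$, which incidentally equals $T(z)^r$. We have derived the
equation

$$ \vartheta \Bigl( \frac{\zeta^r}{(1+\zeta)^r}\, E_r \Bigr) = \frac{\zeta^r}{2 (1+\zeta)^{r-1}} \bigl( \tfrac12 \zeta (1+\zeta) + \vartheta \bigr)^2 E_{r-1}. \tag{5.9} $$

This differential equation determines $E_r$ uniquely when $r > 0$, given $E_{r-1}$, since
$\zeta^r (1+\zeta)^{-r} E_r$ vanishes when $z = 0$.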

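Now all the preliminary groundwork has been laid, and we are ready to calculate $E_r$. 
We know that $E_0 = 1$. A bit of experimentation soon reveals a fairly simple pattern: We 
can prove by induction on $r$ that the solution to (5.9) has the form 

$$E_r(z) = \sum_{d=0}^{2r} e_{rd}\,(1+\zeta)^r\zeta^{2r-d} = \sum_{d=0}^{2r} e_{rd}\,\frac{T(z)^{2r-d}}{(1-T(z))^{3r-d}}, \tag{5.10}$$

where the coefficients $e_{rd}$ are rational numbers, and where $e_{r(2r)} = 0$ for $r > 0$. Let $e_{rd} = 0$ 
when $d < 0$ or $d > 2r$. Assuming that (5.10) holds for some $r$, we use (5.7) and (5.8) to 
compute 

$$A_r = \bigl(\tfrac12\zeta(1+\zeta) + \vartheta\bigr)E_r = \sum_{d=0}^{2r} e_{rd}\,(1+\zeta)^r\zeta^{2r-d}\bigl((r+\tfrac12)\zeta(1+\zeta) + (2r-d)(1+\zeta)^2\bigr) = \sum_{d=0}^{2r+1} a_{rd}\,(1+\zeta)^{r+1}\zeta^{2r+1-d},$$

$$a_{rd} = (3r + \tfrac12 - d)\,e_{rd} + (2r + 1 - d)\,e_{r(d-1)}; \tag{5.11}$$

$$B_r = \bigl(\tfrac12\zeta(1+\zeta) + \vartheta\bigr)A_r = \sum_{d=0}^{2r+1} a_{rd}\,(1+\zeta)^{r+1}\zeta^{2r+1-d}\bigl((r+\tfrac32)\zeta(1+\zeta) + (2r+1-d)(1+\zeta)^2\bigr) = \sum_{d=0}^{2r+2} b_{rd}\,(1+\zeta)^{r+2}\zeta^{2r+2-d},$$

$$b_{rd} = (3r + \tfrac52 - d)\,a_{rd} + (2r + 2 - d)\,a_{r(d-1)}. \tag{5.12}$$

Moreover, the left side of Equation (5.9) is a polynomial, 

$$\vartheta\bigl(\zeta^r(1+\zeta)^{-r}E_r\bigr) = \vartheta\sum_{d=0}^{2r} e_{rd}\,\zeta^{3r-d} = \sum_{d=0}^{2r}(3r-d)\,e_{rd}\,(1+\zeta)^2\zeta^{3r-d}.$$

The corresponding polynomial on the right-hand side is 

$$\tfrac12\zeta^r(1+\zeta)^{1-r}\bigl(\tfrac12\zeta(1+\zeta) + \vartheta\bigr)^2E_{r-1} = \tfrac12\sum_{d=0}^{2r} b_{(r-1)d}\,(1+\zeta)^2\zeta^{3r-d};$$

therefore we can complete the induction proof by setting 

$$e_{rd} = \frac{b_{(r-1)d}}{2(3r-d)}, \qquad 0 \le d \le 2r. \tag{5.13}$$

It is easy to check that $a_{r(2r+1)} = 0$ and $b_{r(2r+2)} = 0$, hence $e_{r(2r)} = 0$ when $r > 0$. 
In particular, $a_{00} = \frac12$, $b_{00} = \frac54$, $b_{01} = \frac12$, and we obtain 

$$E_1(z) = (1+\zeta)\bigl(\tfrac{5}{24}\zeta^2 + \tfrac18\zeta\bigr) = \frac{5}{24}\,\frac{T(z)^2}{(1-T(z))^3} + \frac18\,\frac{T(z)}{(1-T(z))^2}. \tag{5.14}$$

A complex multigraph of excess 1 must consist of a single bicyclic component, so $E_1(z)$ 
is the function we called $W(z)$ in (3.7). If our only goal had been to compute $W(z)$, we 
could of course have gotten this result easily and directly. The more elaborate machinery 
above has been developed so that the generating function $E_r(z)$ can readily be computed 
and analyzed for larger values of $r$. 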
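
The recurrences (5.11)-(5.13) are easy to run mechanically. The following Python sketch is not part of the original derivation; the function names `multigraph_coefficients` and `_at` are illustrative choices. It uses exact rational arithmetic to build the rows $e_{rd}$, starting from $e_{00} = 1$ and treating out-of-range coefficients as zero.

```python
from fractions import Fraction


def _at(seq, d):
    """Return seq[d], treating out-of-range indices as zero."""
    return seq[d] if 0 <= d < len(seq) else Fraction(0)


def multigraph_coefficients(max_r):
    """Rows e[r][d] for 0 <= d <= 2r, built from recurrences (5.11)-(5.13)."""
    rows = [[Fraction(1)]]  # E_0 = 1, so e_00 = 1
    for r in range(1, max_r + 1):
        p = r - 1  # a and b are computed from row r-1
        prev = rows[p]
        # (5.11): a_pd = (3p + 1/2 - d) e_pd + (2p + 1 - d) e_p(d-1)
        a = [(3 * p + Fraction(1, 2) - d) * _at(prev, d)
             + (2 * p + 1 - d) * _at(prev, d - 1)
             for d in range(2 * p + 2)]
        # (5.12): b_pd = (3p + 5/2 - d) a_pd + (2p + 2 - d) a_p(d-1)
        b = [(3 * p + Fraction(5, 2) - d) * _at(a, d)
             + (2 * p + 2 - d) * _at(a, d - 1)
             for d in range(2 * p + 3)]
        # (5.13): e_rd = b_(r-1)d / (2 (3r - d)), 0 <= d <= 2r
        rows.append([b[d] / (2 * (3 * r - d)) for d in range(2 * r + 1)])
    return rows


if __name__ == "__main__":
    for r, row in enumerate(multigraph_coefficients(3)):
        print(r, [str(x) for x in row])
```

Each row ends with $e_{r(2r)} = 0$ for $r > 0$, in agreement with the remark following (5.13).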
6. Enumerating complex graphs. For graphs instead of multigraphs, the calculations 
are more intricate, but it is instructive to look at them and see how they differ. As in (4.1) 
and (4.8), we separate off the cyclic and complex parts of the bgf by writing 

$$\hat G(w,z) = e^{U(wz)/w}\,\hat F(w,z); \qquad \hat F(w,z) = e^{\hat V(wz)}\,\hat E(w,z). \tag{6.1}$$

Adding a new edge to a graph means that we want to mark an unordered pair of distinct 
vertices, and the operator corresponding to this is $\frac12(\vartheta_z^2 - \vartheta_z)$. We must also avoid 
duplicating an edge that's already present, so we must also subtract $\vartheta_w$. Therefore the 
differential equation satisfied by $\hat G$ is not (4.2) but 

$$\frac1w\,\vartheta_w\hat G = \Bigl(\frac{\vartheta_z^2 - \vartheta_z}{2} - \vartheta_w\Bigr)\hat G; \tag{6.2}$$

and the integral equation corresponding to (4.3) is 

$$\hat G(w,z) = e^z + \int_0^w \Bigl(\frac{\vartheta_z^2 - \vartheta_z}{2} - \vartheta_w\Bigr)\hat G(w,z)\,dw. \tag{6.3}$$

A computation similar to our derivation of (4.7) now leads to a differential equation defin- 
ing F: 

-d^F = ((^1:1^-^ \ u]f+ {d,U){d,F) + F. (6.4) 



w \\ 2 

The analog of (5.1) turns out to be 

i {d^E - T^J) = e-^ (^^^^ - E ; (6.5) 

converting to univariate generating functions Er{w,z) = w^Er{wz) yields 

{r + ^-T^)Er = e-^ il-r+ ^ J K-i • (6.6) 

Again we multiply by the integration factor {1 + (Y, but the differential equation turns 
out to be rather messy: 

J( C Vg^ f C \(, C^(10 + 14C + 5n C^-3C-3 

(6.7) 

At least it is linear, and it allows us to compute E^ for small r. It turns out that the 
solution has the form 

Er = Ee.,^^^^ = |^/'-'^(l-T(.))3-'^ ' ^^-^^ 
for appropriate coefficients $\hat e_{rd}$. We have, of course, $\hat e_{00} = 1$ and $\hat e_{0d} = 0$ for $d \ne 0$. When 
$r > 0$, the values of $\hat e_{rd}$ satisfy the following recurrence, equivalent to (6.7): 

$$(3r - d)\,\hat e_{rd} + (6r - d + 1)\,\hat e_{r(d-1)} = \sum_{j=0}^{6} c_j(r-1,\,d)\,\hat e_{(r-1)(d-j)}, \tag{6.9}$$

where 

$$\begin{aligned}
c_0(r,d) &= (6r - 2d + 5)(6r - 2d + 1)/8,\\
c_1(r,d) &= \bigl(132r^2 + (166 - 80d)r + 45 - 50d + 12d^2\bigr)/4,\\
c_2(r,d) &= \bigl(398r^2 + (584 - 220d)r + 205 - 160d + 30d^2\bigr)/4,\\
c_3(r,d) &= \bigl(316r^2 + (515 - 160d)r + 207 - 129d + 20d^2\bigr)/2,\\
c_4(r,d) &= \bigl(279r^2 + (484 - 130d)r + 208 - 112d + 15d^2\bigr)/2,\\
c_5(r,d) &= (13r - 3d + 10)(5r - d + 5),\\
c_6(r,d) &= \bigl(25r^2 + (43 - 10d)r + 18 - 9d + d^2\bigr)/2.
\end{aligned} \tag{6.10}$$

It is not at all obvious that this recurrence has a solution. We can use it to compute $\hat e_{rd}$ 
for $d = 0, 1, \ldots, 3r - 1$, but then the value of $\hat e_{r(3r-1)}$ must satisfy a nontrivial equation 
when we set $d = 3r$. To get the values of $\hat e_{rd}$ when $d > 3r$, we can start by assuming that 
$\hat e_{rd} = 0$ for $d > 6r$ and work backward. We will prove later that the recurrence always 
does have a solution, and that the last nonzero coefficient for fixed $r$ can be completely 
characterized by an almost unbelievable (but true) formula: If $\binom{s-2}{2} \le r < \binom{s-1}{2}$, then 

$$\hat e_{r(5r-s)} = \frac{1}{s!}\binom{s(s-1)/2}{s+r}. \tag{6.11}$$

Moreover, $\hat e_{rd} = 0$ for all $d > 5r - s$. Here is a table of values for small $r$, in case the reader 
would like to check a computer program that is based on the formulas above: 



Entries in each row are listed for d = 0, 1, 2, ... in order.

ê_{0d} =  1
ê_{1d} =  5/24,  1/4
ê_{2d} =  385/1152,  175/96,  133/32,  79/16,  49/16,  5/6,  1/24
ê_{3d} =  85085/82944,  5005/512,  97097/2304,  7777/72,  43621/240,  200561/960,  950569/5760,  14001/160,  7021/240,  773/144,  3/8



7. A surprising pattern. The numbers $\hat e_{rd}$ that characterize cyclic graphs of excess $r$ 
do not appear to have any nice mathematical properties. But when we calculate the 
corresponding coefficients $e_{rd}$ for multigraphs, as defined in (5.11)-(5.13), we run into 
patterns that cry out for explanation. For example, here is a table showing the values for 
small $r$ (entries in each row are listed for d = 0, 1, 2, ..., 8 in order): 

e_{0d} =  1
e_{1d} =  5/24,  1/8
e_{2d} =  385/1152,  35/64,  91/384,  1/48
e_{3d} =  85085/82944,  25025/9216,  23023/9216,  2849/3072,  19/160,  1/384
e_{4d} =  37182145/7962624,  11316305/663552,  3556553/147456,  3658655/221184,  1656083/294912,  8723/10240,  1969/46080,  1/3840
e_{5d} =  5391411025/191102976,  929553625/7077888,  7994161175/31850496,  8068525465/31850496,  341105765/2359296,  327803333/7077888,  1606891/207360,  140569/245760,  4043/322560



Anybody who has played with integers knows that the numerator of $e_{32}$, 23023, is equal 
to $7\cdot11\cdot13\cdot23$; moreover, the denominator is $9216 = 2^{10}\cdot3^2$. Further experiments show 
that the factorization of, say, $e_{55}$, is $2^{-18}\cdot3^{-3}\cdot11\cdot13\cdot17\cdot19\cdot47\cdot151$. The occurrence of 
so many small prime factors cannot be a coincidence! 

It is, in fact, easy to see the pattern in the numbers $e_{r0}$, which satisfy the recurrence 

$$e_{r0} = \frac{(6r-1)(6r-5)}{24r}\,e_{(r-1)0} \tag{7.1}$$

according to rules (5.11)-(5.13). The numbers $\hat e_{r0}$ also satisfy the same recurrence, 
according to (6.9) and (6.10). Therefore we find 

$$e_{r0} = \hat e_{r0} = \frac{(6r)!}{2^{5r}\,3^{2r}\,(3r)!\,(2r)!}. \tag{7.2}$$

But the recurrence defining $e_{rd}$ for $d > 0$ is much more complex, and we have no a priori 
reason to expect these numbers to have any mathematical virtues. The following theorem 
provides an algebraic explanation of what is going on. 
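
As a quick sanity check on the leading coefficients, the short Python sketch below compares the factorial closed form (7.2) with the product obtained by iterating the one-term recurrence (7.1). The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction
from math import factorial


def e_r0_closed(r):
    """Closed form (7.2): (6r)! / (2^(5r) 3^(2r) (3r)! (2r)!)."""
    return Fraction(factorial(6 * r),
                    2 ** (5 * r) * 3 ** (2 * r) * factorial(3 * r) * factorial(2 * r))


def e_r0_recurrence(r):
    """Iterate (7.1): e_r0 = (6r-1)(6r-5)/(24r) * e_(r-1)0, starting from e_00 = 1."""
    value = Fraction(1)
    for k in range(1, r + 1):
        value *= Fraction((6 * k - 1) * (6 * k - 5), 24 * k)
    return value


if __name__ == "__main__":
    for r in range(6):
        print(r, e_r0_closed(r), e_r0_recurrence(r))
```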



Theorem 1. The numbers $e_{rd}$ defined in (5.10) can be expressed as 

$$e_{rd} = \frac{(6r-2d)!\;P_d(r)}{2^{5r}\,3^{2r-d}\,(3r-d)!\,(2r-d)!}, \tag{7.3}$$

where $P_d(r)$ is a polynomial of degree $d$ defined by the formulas 

$$P_d(r) = [z^d]\,F(z)^{2r-d}, \tag{7.4}$$

$$F(z) = 3!\sum_{n\ge0}\frac{(4z)^n}{(n+3)!} = 6\,\frac{e^{4z} - 8z^2 - 4z - 1}{(4z)^3}. \tag{7.5}$$

Proof. By the duplication and triplication formulas for the Gamma function, expression 
(7.3) can also be written 

$$e_{rd} = g_{rd}\,P_d(r), \qquad g_{rd} = \frac{3^r}{2^{r+d}}\;\frac{\Gamma(r+\frac56-\frac d3)\,\Gamma(r+\frac12-\frac d3)\,\Gamma(r+\frac16-\frac d3)}{2\pi\,\Gamma(r+1-\frac d2)\,\Gamma(r+\frac12-\frac d2)}. \tag{7.6}$$

Therefore recurrence equation (5.11) becomes 

$$a_{rd} = 3(r+\tfrac16-\tfrac d3)\,g_{rd}\,P_d(r) + 2(r+\tfrac12-\tfrac d2)\,g_{r(d-1)}\,P_{d-1}(r) = 3(r+\tfrac16-\tfrac d3)\,g_{rd}\,A_d(r),$$

$$A_d(r) = P_d(r) + \tfrac43 P_{d-1}(r). \tag{7.7}$$

Similarly, but without as much cancellation, (5.12) becomes 

$$b_{rd} = 3(r+\tfrac56-\tfrac d3)\,3(r+\tfrac16-\tfrac d3)\,g_{rd}\,A_d(r) + 2(r+1-\tfrac d2)\,3(r+\tfrac12-\tfrac d3)\,g_{r(d-1)}\,A_{d-1}(r) = \tfrac92\,g_{r(d-1)}\,B_d(r),$$

$$B_d(r) = (r+\tfrac56-\tfrac d3)(r+\tfrac12-\tfrac d2)\,A_d(r) + \tfrac43(r+1-\tfrac d2)(r+\tfrac12-\tfrac d3)\,A_{d-1}(r). \tag{7.8}$$

Relation (5.13) becomes 

$$(3r+3-d)\,g_{(r+1)d}\,P_d(r+1) = \frac{9\,(r+1-\frac d3)(r+\frac56-\frac d3)(r+\frac12-\frac d3)}{4\,(r+1-\frac d2)}\;g_{r(d-1)}\,P_d(r+1) = \tfrac12\,b_{rd};$$

hence the original recurrence takes the following form: 

$$(r+1-\tfrac d3)(r+\tfrac56-\tfrac d3)(r+\tfrac12-\tfrac d3)\,P_d(r+1) = (r+1-\tfrac d2)\,B_d(r). \tag{7.9}$$

The boundary conditions are 

$$P_d(r) = 0 \text{ for } d < 0; \qquad P_0(r) = 1; \qquad P_{2d}(d) = 0 \text{ for } d > 0. \tag{7.10}$$

It is by no means obvious that a polynomial $P_d(r)$ will satisfy (7.7), (7.8), and (7.9). 
The key observation that makes everything work is that a solution to the simpler recurrence 

$$(r + \tfrac12 - \tfrac d3)\,P_d(r+\tfrac12) = (r + \tfrac12 - \tfrac d2)\,A_d(r) \tag{7.11}$$

suffices to solve the more complex one. This new recurrence is sort of a "half step" between 
solutions of (7.7), (7.8), and (7.9); it tells us about multigraphs whose excess is an integer 
plus $\frac12$, whatever that may mean. 

A solution to (7.11) in the extended domain implies a solution to (7.9). For we will 
then have 

$$(r+1-\tfrac d3)(r+\tfrac56-\tfrac d3)(r+\tfrac12-\tfrac d3)\,P_d(r+1) = (r+\tfrac56-\tfrac d3)(r+\tfrac12-\tfrac d3)(r+1-\tfrac d2)\,A_d(r+\tfrac12)$$

and 

$$\begin{aligned}
(r+1-\tfrac d2)\,B_d(r) &= (r+1-\tfrac d2)(r+\tfrac56-\tfrac d3)(r+\tfrac12-\tfrac d2)\,A_d(r) + \tfrac43(r+1-\tfrac d2)^2(r+\tfrac12-\tfrac d3)\,A_{d-1}(r)\\
&= (r+1-\tfrac d2)(r+\tfrac56-\tfrac d3)(r+\tfrac12-\tfrac d3)\,P_d(r+\tfrac12)\\
&\quad{} + \tfrac43(r+1-\tfrac d2)(r+\tfrac56-\tfrac d3)(r+\tfrac12-\tfrac d3)\,P_{d-1}(r+\tfrac12).
\end{aligned}$$

Moreover, $P_d(\frac d2) = 0$ when $d > 0$. 

We can solve the simultaneous recurrences (7.7) and (7.11) by constructing solutions 
to (7.7) that have the desired form (7.4), namely 

$$P_d(r) = [z^d]\,F(z)^{2r-d}, \qquad A_d(r) = [z^d]\,F(z)^{2r-d}\bigl(1 + \tfrac43 zF(z)\bigr),$$

and noting that the function $F(z)$ of (7.5) satisfies 

$$\vartheta F(z) = 4zF(z) + 3 - 3F(z). \tag{7.12}$$

Thus we have 

$$\begin{aligned}
d\,P_d(r+\tfrac12) &= [z^d]\,\vartheta\bigl(F(z)^{2r+1-d}\bigr)\\
&= [z^d]\,(2r+1-d)\,F(z)^{2r-d}\bigl(4zF(z) + 3 - 3F(z)\bigr)\\
&= (6r+3-3d)\bigl(A_d(r) - P_d(r+\tfrac12)\bigr),
\end{aligned}$$

and (7.11) holds. □ 
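
Theorem 1 can be spot-checked numerically. The Python sketch below is an illustration only, with hypothetical helper names `series_F`, `P` and `e`. It expands the power series $F(z)$ of (7.5) with exact rationals, extracts $P_d(r) = [z^d]F(z)^{2r-d}$ by repeated truncated multiplication (so it assumes integer $r$ and $0 \le d \le 2r$), and then evaluates the right-hand side of (7.3) for comparison with the tabulated $e_{rd}$.

```python
from fractions import Fraction
from math import factorial


def series_F(n_terms):
    """First n_terms coefficients of F(z) = 3! * sum_{n>=0} (4z)^n / (n+3)!  (7.5)."""
    return [Fraction(6 * 4 ** n, factorial(n + 3)) for n in range(n_terms)]


def P(d, r):
    """P_d(r) = [z^d] F(z)^(2r-d) for integers 0 <= d <= 2r  (7.4)."""
    f = series_F(d + 1)
    power = [Fraction(1)] + [Fraction(0)] * d  # the constant series 1
    for _ in range(2 * r - d):
        # multiply by F(z), keeping terms through z^d
        power = [sum(power[i] * f[n - i] for i in range(n + 1))
                 for n in range(d + 1)]
    return power[d]


def e(r, d):
    """Right-hand side of (7.3)."""
    g = Fraction(factorial(6 * r - 2 * d),
                 2 ** (5 * r) * 3 ** (2 * r - d)
                 * factorial(3 * r - d) * factorial(2 * r - d))
    return g * P(d, r)


if __name__ == "__main__":
    for r in range(4):
        print(r, [str(e(r, d)) for d in range(2 * r + 1)])
```

For $d = 2r$ the exponent $2r - d$ is zero, so $P_{2r}(r) = [z^{2r}]\,1 = 0$ for $r > 0$, which matches the boundary condition in (7.10).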

Incidentally, the theory of confluent hypergeometric functions provides us with alternative 
expressions for the function $F(z)$ in (7.5). We have, for example, 

$$F(z) = F(1;\,4;\,4z) = 3\int_0^1 e^{4zt}(1-t)^2\,dt. \tag{7.13}$$

The general theory of [23] also allows us to write 

$$P_d(r) = \frac{2r-d}{2r}\,[z^d]\,G(z)^{2r}, \tag{7.14}$$
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where G{z) = 1 + z - ^z^ + j^z^ - j^z"^ + ^z^ - ^^z^ + • • • is defined impficitly by 
the relation 

G{zF{z)) =F{z). (7.15) 
Corollary. For fixed d>0 we have 

as r — > oo. Moreover, Crd is a rational number whose numerator has at most 

d + 0{d{logdf/logr) (7.17) 

prime factors greater than 6r, and whose denominator has no prime factors greater than 3r. 
Proof The obvious bounds 

2r—d ^ r„di ttI — d 



ir-d - 1 



tell us that Pd(r) = {Ir fj d\^0{r'^-^). Formula (7.16) now follows from (7.3) and Stirling's 
approximation. (We will derive a more precise estimate, suitable when d varies with r, in 
section 23 below, Lemma 8.) 

All prime factors greater than 6r must appear as prime factors of -Pd(r). We will prove 
the upper bound (7.17) by showing that mrfPd(r) is an integer, where 

rud = 5L'^/2J6L^/3J7L'^/^J . . . = J](fc + 3)Wki . (7.19) 

k>2 

It will follow that the denominator of -Pd(r) contains no prime factors greater than 2r+ 1, 
and that if the numerator contains k prime factors greater than 6r, we have (6r)'^ < 
TnridPdir) < md{^^^^) < md{2r)^; i.e., klogGr < (i log 2r + log = (ilog2r + 0((i(log(i)^). 

The coefficient of z in any power of F{z) is a sum of terms ftftfS'' ■ • • ^ where 
fj = [z^] F{z) = II • • • ki + 2/c2 + 3A;3 -\- ■ ■ ■ = d. Thus, for example, the factor 7 

occurs in the denominator of fi^f2^fs^ ■ ■ ■ exactly k^ + k^ + ■ ■ ■ < d/A times. It follows 
that the denominator of Pd is a divisor of rud- □ 

The estimate (7.17) can be sharpened for small $d$, because $P_d(r)$ always has $(2r - d)$ 
as a factor when $d > 0$. For example, 

$$P_1(r) = 2r-1, \qquad P_2(r) = \frac{(r-1)(10r-7)}{5}, \qquad P_3(r) = \frac{(2r-3)(10r^2-21r+10)}{15}.$$

There are no prime factors $> 6r$ when $d \le 1$, and there is at most one when $d \le 3$. 
Instead of writing 

$$E_r(z) = \sum_{d=0}^{2r} e_{rd}\,\frac{T(z)^{2r-d}}{(1-T(z))^{3r-d}}$$

it is sometimes convenient to use coefficients $e'_{rd}$ such that 

$$E_r(z) = \sum_{d=0}^{2r} \frac{e'_{rd}}{(1-T(z))^{3r-d}}. \tag{7.20}$$

The following table shows that the numbers $e'_{rd}$ tend to alternate in sign (entries in each row are listed for d = 0, 1, 2, ..., 8 in order): 

e'_{0d} =  1
e'_{1d} =  5/24,  -7/24,  1/12
e'_{2d} =  385/1152,  -455/576,  77/128,  -43/288,  1/288
e'_{3d} =  85085/82944,  -95095/27648,  119119/27648,  -201355/82944,  38623/69120,  -803/34560,  -139/51840
e'_{4d} =  37182145/7962624,  -40415375/1990656,  141292151/3981312,  -62775713/1990656,  116866321/7962624,  -15867137/4976640,  850003/4976640,  25129/1244160,  -571/2488320



Again, patterns lurk beneath the surface, and there is a prevalence of small prime factors; 
for example, -e'55 = ^g^oSf = 2"^® • 3-^ • 11 • 13 • 17 • 19 • 23 • 31 • 229. We can in fact 
prove the existence of a pattern similar to that of the original coefficients Crd' 



Corollary. The numbers $e'_{rd}$ defined in (7.20) can be expressed as 

$$e'_{rd} = \frac{(6r-2d)!\;Q_d(r)}{2^{5r}\,3^{2r-d}\,(3r-d)!\,(2r-d)!}, \tag{7.21}$$

where $Q_d(r)$ is a polynomial of degree $d$ for which $Q_d(\frac d3 - \frac12) = 0$ when $d > 0$. 

Proof. By definition, we have 

$$e'_{rd} = \sum_{k=0}^{d}\binom{2r-k}{d-k}(-1)^{d-k}\,e_{rk}, \tag{7.22}$$

because the quantity $T^{2r-k} = (1-(1-T))^{2r-k}$ contributes $\binom{2r-k}{d-k}(-1)^{d-k}$ to the coefficient 
of $(1-T)^{d-k}$. Now if we plug in equations (7.3) and (7.21), we find that 

$$Q_d(r) = \sum_{k=0}^{d}\frac{(-1)^{d-k}P_k(r)}{3^{d-k}\,(d-k)!}\;\frac{(6r-2k)!\,(3r-d)!}{(6r-2d)!\,(3r-k)!} = \sum_{k=0}^{d}\Bigl(-\frac43\Bigr)^{d-k}\binom{3r-k-\frac12}{d-k}P_k(r), \tag{7.23}$$

clearly a polynomial in $r$ of degree $\le d$. In fact, the leading term is 

$$\sum_{k=0}^{d}\Bigl(-\frac43\Bigr)^{d-k}\frac{(3r)^{d-k}\,(2r)^k}{(d-k)!\;k!} = \frac{(-2r)^d}{d!},$$

so the degree is exactly $d$. If we set $r = \frac d3 - \frac12$, the sum reduces to $A_d(\frac d3 - \frac12)$, which we 
know is zero for $d > 0$ by (7.11). □ 

It is interesting to try to compute the coefficients e^^ directly, by proceeding as we did 
in section 5 but using the variable ^ = 1 + C=(l— T{z)) in place of (. The calculations 
are essentially the same, even slightly simpler, until we get to the analog of equation (5.13); 
the recurrences that replace (5.11)-(5.13) are 

<d - (3r+i-d)e;, - (3r+| - d)e'^^^_,^ ; (7.24) 
(3r-d)e;, - (2r+l-d)e;(,_,) = |((3r-i-d)a;,_,), - {Sr+'^-d)a[^_,^^,_,^) . (7.25) 

It appears to be quite difficult to derive (7.21) directly from these recurrences. The 
recurrence for Qd{r), corresponding to equation (7.9) for Pd{r), turns out to be 

- Di^-hDQdir) = ir- l)ir-l-l)Qdir - 1) 

+ |(-+|-f)(--|-f)Q^-iW 

- 4(r - f )(r-i-|)(r - |)(r-i-f )-iQ,_i(r - 1) 

+ 4(r+i-f)(r-i-f)Qd_2(r-l), (7.26) 

and we can proceed to solve it for d = 1, 2, . . . , if we first multiply both sides by the 
summation factor r(r-|)r(r-i-|)r(r+l-|)-ir(r+i-|)-^ The equation for d > 
then takes the form 



Sd{r) = Sd{r - 1) + gd{r) + gd{r - |) , 
r(r+l - ^) r(r+i--) 

r(r+l^J)r(r+|-|) 

''^('•)= r(.+i-i)r(.+i-i) ^t--'' 

where /d(r) = Qd{f) — r-d/3 ^^(^ — |) is a polynomial of degree d — 1. For example, 
fi{f) = — I and f2{r) = l'^ ~ |- There is apparently no analog of the simple relation 
(7.11) that made everything work nicely in the theorem above. 
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A generating function for Qdif)-, anafogous to (7.4), can be found by analyzing (7.23) 
more carefully. Let H{z) satisfy 

H{z) = F{z H{z)-^'^) = l + ^ + ^^2^3^^3 + -- -; (7.27) 

then the elementary theory in [23] proves that 

{x - f ) [z'^] H{zf = X [z"^] F{zf-'^'^ . (7.28) 

Hence, by (7.11) and (7.4), 

I 1 _ d 

Aa(r) = —^^ 1 [z"] F{zf^+^-'' = [z''] H{zf''+^-'"'/^ . (7.29) 

^+2-2 

And (7.23) can therefore be "summed": 

'3r-d + /c+i 



2/3)(-3r-3/2+d-fe) 



/A N -3r-3/2+d 

= [z'']\^-z + H{z)-^/^^ . (7.30) 

In particular, 

$$Q_0(r) = 1; \qquad Q_1(r) = -2(r + \tfrac16); \qquad Q_2(r) = 2(r - \tfrac16)(r - \tfrac15).$$

Although $Q_1(r) = -A_1(r)$ and $Q_2(r) = A_2(r)$, we have $Q_3(r) = -A_3(r) + \frac{16}{135}(r - \frac12)$. 

8. Sparse components. We can readily compute the univariate generating functions 
$C_1(z), C_2(z), C_3(z), \ldots, C_r(z)$ for bicyclic, tricyclic, tetracyclic, ..., $(r+1)$-cyclic 
components, now that we know the simple form of $E_1(z), E_2(z), E_3(z), \ldots, E_r(z)$, because of 
the fact that 

$$\sum_{r\ge0} w^r E_r = \exp\Bigl(\sum_{r\ge1} w^r C_r\Bigr). \tag{8.1}$$

Differentiating this formula with respect to $w$ and equating coefficients of $w^{r-1}$ leads to 
the expression 

$$rE_r = \sum_{k=1}^{r} k\,C_k\,E_{r-k}, \tag{8.2}$$

from which we may find $C_r$ by calculating 

$$C_r = E_r - \frac1r\sum_{k=1}^{r-1} k\,C_k\,E_{r-k}. \tag{8.3}$$

Since we know that $E_r = (1+\zeta)^r\sum_{d=0}^{2r} e_{rd}\,\zeta^{2r-d}$ for $r > 0$, it follows by induction that 
$C_r$ can be written in the same form, 

$$C_r = (1+\zeta)^r\sum_{d=0}^{2r-1} c_{rd}\,\zeta^{2r-d}, \tag{8.4}$$

for appropriate coefficients $c_{rd}$. (The variable $\zeta$ stands for $T(z)/(1-T(z))$, as in section 5.) 
Indeed, relation (8.3) tells us that we can compute $c_{rd}$ by evaluating a double sum 

$$c_{rd} = e_{rd} - \frac1r\sum_{k=1}^{r-1} k\sum_{j} c_{kj}\,e_{(r-k)(d-j)}; \tag{8.5}$$

the inner sum here is over the range $\max(0,\,d+1-2r+2k) \le j \le \min(d,\,2k-1)$, 
which is always nonempty for $0 < k < r$ except when $d = 2r - 1$. We always have 
$c_{r(2r-1)} = e_{r(2r-1)} = 1/\bigl(2^{r+1}(r+1)!\bigr)$. Here is a table of the coefficients $c_{rd}$ for small $r$: 

Entries in each row are listed for d = 0, 1, 2, ..., 9 in order.

c_{1d} =  5/24,  1/8
c_{2d} =  5/16,  25/48,  11/48,  1/48
c_{3d} =  1105/1152,  985/384,  1373/576,  515/576,  223/1920,  1/384
c_{4d} =  565/128,  12455/768,  26581/1152,  12227/768,  2089/384,  9583/11520,  27/640,  1/3840
c_{5d} =  82825/3072,  387005/3072,  371195/1536,  10154003/41472,  121207/864,  519883/11520,  1573507/207360,  2597/4608,  803/64512,  1/46080

In applications, the leading coefficients $c_{r0}$ of $C_r$ are the most important, as are the 
leading coefficients $e_{r0}$ of $E_r$, because these govern the dominant asymptotic behavior of 
$[z^n]\,C_r(z)$ and $[z^n]\,E_r(z)$. Therefore it is convenient to write 

$$e_r = e_{r0}, \qquad c_r = c_{r0}. \tag{8.6}$$

We have seen in (7.2) that there is a simple way to express the numbers $e_r$ in terms of 
factorials. The values $c_r$ are then easily computed by using relation (8.3), but with $c_r$ and 
$e_r$ substituted respectively for $C_r$ and $E_r$. 

Asymptotically speaking, the values of $c_{rd}$ and $e_{rd}$ are equivalent when $r$ is large. 
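
The double sum (8.5) amounts to a truncated convolution of coefficient rows. As an illustration (the names `E_ROWS` and `component_coefficients` are hypothetical, and the input rows are simply the tabulated $e_{rd}$ for $r \le 3$ with the trailing zero $e_{r(2r)}$ dropped), a minimal Python sketch is:

```python
from fractions import Fraction as Fr

# Tabulated e_rd for r = 1, 2, 3 and d = 0 .. 2r-1 (section 7).
E_ROWS = {
    1: [Fr(5, 24), Fr(1, 8)],
    2: [Fr(385, 1152), Fr(35, 64), Fr(91, 384), Fr(1, 48)],
    3: [Fr(85085, 82944), Fr(25025, 9216), Fr(23023, 9216),
        Fr(2849, 3072), Fr(19, 160), Fr(1, 384)],
}


def component_coefficients(e_rows, max_r):
    """Rows c[r][d] from (8.5): c_rd = e_rd - (1/r) sum_k k sum_j c_kj e_(r-k)(d-j)."""
    c_rows = {}
    for r in range(1, max_r + 1):
        row = list(e_rows[r])  # start from e_rd
        for k in range(1, r):
            for j, c_kj in enumerate(c_rows[k]):
                for i, e_val in enumerate(e_rows[r - k]):
                    # the product contributes to d = i + j, which is at most 2r - 2
                    row[i + j] -= Fr(k, r) * c_kj * e_val
        c_rows[r] = row
    return c_rows


if __name__ == "__main__":
    for r, row in component_coefficients(E_ROWS, 3).items():
        print(r, [str(x) for x in row])
```

The last entry of each row is untouched by the convolution, which reflects the identity $c_{r(2r-1)} = e_{r(2r-1)}$.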
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Theorem 2. For fixed d> we have 



cra = era{l + 0(r-^)) = ^ ^"+^^,'^' (1 + 0(r-^)) (8.7) 



as r — > oo. 



Proof. We know the asymptotic value of e^d from (7.16). To complete the proof, we need 
only show that the double sum in (8.5) is Od{erd/f)i where implies a bound for fixed d 
as r — > oo. 

Since Crd < ^rdi each term in the double sum is bounded above by an absolute constant 
(depending on d) times 

2,'' k{k + j -l)\{r-k + d- j 3"^ k {r + d - 2)1 fd\ / fr + d-2^ 



2^r j\ (d-jy. 2'"r d\ \j J / \k + j - 1 

We have (^^^~^) > r + d — 2 except when k = 1 and j=OorA; = r — 1 and j = d. 
Therefore all but one term is Od{erd/r^), and the exceptional term is Od{erd/f). There 
are 0{rd) terms altogether, so the overall double sum is Odierd/f). □ 

The simple form (8.4) of Cr{z), the generating function for (r + l)-cyclic multigraphs, 
makes it possible for us to deduce a formula for the corresponding graph-based function 
Cr{z)^ which turns out to be only about 50% more complicated. In fact, we will prove 
a result that applies to the generating functions for infinitely many models of random 
graphs, including both G{w, z) and G{w, z) as special cases. 

Our starting point for this calculation is the formal power series relation 

G{w,z) = G{\■n{l + w),z/^/TTw) . (8.8) 

which is an immediate consequence of (2.7) and (2.9). It follows that 

C{w,z) = G{\n{l + w),z/y/TTw) . (8.9) 

We can therefore obtain a near-polynomial formula for Cr{z) as a special case of the 
following result. 

Theorem 3. If f{w) = 1 + fiw + + ■ ■ ■ and g{w) — l-\-giw-\- g2W^ + ■ ■ ■ are arbitrary 
formal power series with f{0) — g{0) — 1, and if 

C{w, z)^G (wf{w),z^^ = ^ w^Criwz) , (8.10) 

where G is the bgf (2.10) for connected multigraphs, then there exist coefficients c^d such 
that 

3'^+2 3r+2 rpf ^3r+2-d 
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for all r > 0. 



Proof. Consider Ramanujan's function Q{n) of (3.11), which has the asymptotic value 
+ 0(1) as n oo. Following Knuth [22], we shall say that a function s{n) of the 
form p{n) + q{n)Q{n) is a semipolynomial when p and q are polynomials. The degree 
of a semipolynomial is computed by assuming that Q{n) is of degree |. For example, 
3 + 2n + (1 + n)(5(n) is a semipolynomial of degree |. More formally, if d is any nonnegative 
integer, the semipolynomial p{n) + q{n)Q{n) has degree < ^d if and only if p has degree 
< ^d and q has degree < 

The formulas (3.12) of section 3, taken from [24], show that generating functions of 
the form F(z) = Ylk=i '^k/{^ — T{z))'' are precisely those whose coefficients satisfy 

F{z) = ^ 

where s{n) is a semipolynomial of degree < ^{d — 1). 
Consider now the expansion 

^w''f{wyCr{zwg{w)) = y^w'' C'rjwz) 

r r 

which follows from (8.10) and (2.11). We will study how each term on the left contributes 
to terms on the right. First, when r = — 1 we have 

U{zwg{w)) _ ^ n"-2^"t(;"-^(l + giw + 



wf{w) n!(l + /!«; + •••) 

^n^n-l ^ _|_ jip^[ji^yj _|_ npi{n)w'^ + ■ ■ ■ ) 



where each pi{n) is a polynomial of degree < I. The effect is to make C-i{z) = U{z), and 
to contribute a linear combination of U{z), T{z), and (l — T{z)) . . . , (l — T{z)) '^^'^^ 
to Ci{z) for each I > 0. Next, when r = we have 

V{zwg{w)) = ^ ^n'^~'^Q{n)z'^w'^{l^-npQ{n)w^-npi{n)v? -\ ) ; 

n>l 



this contributes V{z) to Cq{z) and a linear combination of (l ~T[z)^ , . . . , (l — T[z)) 
to Ci{z) for each / > 0. Finally, when r > we have, by (5.11), 

w'^ f{wyCr{zwg{w)) = ^ T—^z'^w'^^'^(l + npo{n)w + npi{n)w'^ H )f{wy , 

n>0 
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where s{n) is a semipolynomial of degree < — ^- This contributes a hnear combination 
of (1-T(z))~\ . . . , {1-T{z))~^^~'' to Ci{z) for each I > r. The proof of (8.11) is complete, 
because U{z) = iC(2 + 0/(1 + 0' and T{z) = C/(l + 0- □ 

Incidentally, our proof shows that the only contribution to the coefficient of the "lead- 
ing term" T{zf^+'^/{l-T{z)f^ ofCi{z) comes from Ci(z) itself. Therefore Cr{z) and Cr{z) 
have identical leading coefficients. In particular, Cj-o = c^o = Cr- We will see below that 
this gives the same asymptotic characteristics to the limiting distribution of component 
types in the uniform and permutation models when m ~ |n. 

Theorem 3 justifies our earlier assertion that the recurrence (6.9)-(6.10) for Crd has 
a solution. The coefficients Crd can be computed from those coefficients Crd using the 
relation Cr = — ^ X]fc=i kCkEr-k, but that makes Cr a polynomial of degree 5r with 
denominator (1 + C)^^5 so the numerator and denominator must be divided by (1 + C)^^~^- 
A simpler recurrence for Cr was found by Wright [41], who proved Theorem 3 in the special 
case Cr = Cr by a different method. Translated into the notation of the present paper, 
Wright's recurrence is 

^(yTc) ^ 2(1 + 0-^ (^Xj(^g,)(^a-i-,) + (^^-3^-2(r-l))a-i^ , r>0, 

(8.12) 

with "dCo = |C^(1 + C)~^- As we saw for the related sequence Er in section 6, it isn't 
obvious that this recurrence has a solution of the desired form 

3r+2 3r+2 ( ., y^ r +2 - d 

Cr{z) = J2 Crde^'-\l + C)-^ = E ^rd ^^ .^-d ' (^'l^) 

d=o d=o il-^(^)j 

when r > 0. Theorem 3 provides an algebraic proof, while Wright proved the existence by 
a combination of algebraic and combinatorial methods that we will consider in the next 
section. Here is a table of the first few values of the coefficients: 
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C2d = 


16 


48 
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24 




















1105 


395 


15131 


2399 


8303 


557 
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C3d = 


1152 


72 


1152 


144 


720 


144 
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565 


26165 


133651 


523789 


80573 


317611 


77773 


89 


839 


1 








C4d = 


128 


768 


1152 


2304 


288 


1440 


720 


3 


240 


12 










82825 


67005 


1770535 


31448897 


438258631 


1146749 


86265 


304411 


25180997 


109627 


781 


439 


1 


C5d 


3072 


256 


1536 


10368 


82944 


180 


16 


96 


20160 


360 


20 


240 


120 



Notice that Crd = for sufficiently large values of d; we do not have to go all the way 
up to d = 3r + 2. In fact, we will see in the next section that the final nonzero coefficient 
is Cr(3r+2-s) when (*2^) < r < (*2^), and it has the value exhibited in (6.11). 
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The asymptotic value of the leading coefficients Cro = Cro = Cr has an interesting 
history. Wright [44] gave a complicated argument establishing that c^-o is asymptotically 
(l)'^(r — 1)! times a certain constant, for which he obtained the numerical value 0.159155. 
Stepanov [35] independently computed the numerical value '0,46. . . ' for three times the 
constant; the approximation 0.48 would have been more accurate, but Stepanov was per- 
haps conjecturing that the true value would be | + (\/3+ln(2 — \/3 )) ^ 0.46546, which he 
announced at the same time in connection with another problem concerning the size of the 
largest component when the centroid is removed from a random tree. Wright's constant 
was identified as l/27r by G. N. Bagaev and E. F. Dmitriev [2], who presented without 
proof a list of asymptotic expressions for the solution of several related enumeration prob- 
lems. Lambert Meertens independently found a proof in 1986, but did not publish it; 
his approach was reported later in [3]. A detailed analysis was also carried out by V. A. 
Voblyi [38], who obtained a number of interesting auxiliary formulas. In particular, if we 
write c{z) = ciz + c^z^ -A- c^z^ + • • • , Voblyi proved the formal power series relation 

In other words, he proved that the coefficients show up in the asymptotic series 

-^-2/3(1/32:) 1 ^ Q 2 « 3 o 3 

T n /o \ ~ 1 - o - 3ci2 - 6C2Z - 9c3Z , (8.15) 

-'l/3(l/'52;) 2 

as ^ ^ 0. This is interesting because the left-hand side can also be expressed as a continued 
fraction 

2z + , (8.16) 

8^+ 

Uz+ 



1 

20^ + 



26^ + • • • 

using the standard recurrence zli,+i{z) = zlj^-i{z) — 2i/I^{z) for the modified Bessel 
functions Iv>{z). In the course of his investigation, Voblyi noticed that the coefficients 
of e^^^^ have a simple form, although he did not mention their combinatorial significance; 
these are the numbers we have called e^,. He gave the formulas 

= (-1)^(1/3, r)=£(:±^Z5W±M, (8.17) 

which are equivalent to (7.2). Here (i^, r) denotes Hankel's symbol, 

■ k=l 
9. Structure of complex multigraphs. The generating functions $E_r$, $C_r$, $(1+\zeta)^{2r}\hat E_r$, 
and $(1+\zeta)^2\hat C_r$ are polynomials in $\zeta$, and these polynomials have a combinatorial 
interpretation that provides considerable insight into what is happening as a graph or multigraph 
evolves. The inner structure in the case of $\hat C_r$ was studied by Wright in his original paper 
[41]; we will see that his results for graphs become simpler when we consider the analogous 
results for multigraphs. 

Let $M$ be a cyclic multigraph of excess $r$, i.e., any multigraph with no acyclic 
components, having $r$ more edges than vertices. We can "prune" $M$ by repeatedly cutting off 
any vertex of degree 1 and the edge leading to that vertex; this eliminates as many edges 
as vertices, so the pruned multigraph $\bar M$ still has excess $r$. Each vertex of $\bar M$ has degree 
at least 2. Such multigraphs are called smooth. 
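
The pruning operation just described is simple to express as code. The sketch below is only an illustration: it represents a multigraph as a list of vertex pairs (an assumption made here, not the paper's representation), and `prune_to_core` is a hypothetical name. A self-loop is counted as contributing 2 to the degree of its vertex.

```python
from collections import defaultdict


def prune_to_core(edges):
    """Repeatedly delete degree-1 vertices together with their edges.

    `edges` is a list of (x, y) pairs; repeated pairs are parallel edges
    and (x, x) is a self-loop. Returns the edge list of the pruned multigraph.
    """
    edges = list(edges)
    while True:
        degree = defaultdict(int)
        for x, y in edges:
            degree[x] += 1
            degree[y] += 1  # a self-loop (x, x) therefore adds 2 to degree[x]
        leaves = {v for v, deg in degree.items() if deg == 1}
        if not leaves:
            return edges
        # drop every edge that touches a current leaf
        edges = [(x, y) for x, y in edges if x not in leaves and y not in leaves]


if __name__ == "__main__":
    # a triangle 1-2-3 with a path 3-4-5 hanging off it
    print(prune_to_core([(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]))
```

Each pass removes one edge per removed leaf in a cyclic multigraph, so the difference between the number of edges and the number of vertices is unchanged.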

Conversely, given any smooth multigraph $\bar M$, we obtain all multigraphs $M$ that prune 
down to it by simply sprouting a tree from each vertex of $\bar M$ (i.e., identifying that vertex 
with the root of a rooted tree). Since $T(z)$ is the generating function for rooted trees, it 
follows that 

$$F_r(z) = \bar F_r(T(z)), \tag{9.1}$$

where $F_r(z)$ is the generating function for all cyclic multigraphs of excess $r$ and $\bar F_r$ is the 
generating function for all smooth multigraphs of excess $r$. Thus, for example, we must 
have 

$$\bar F_1(z) = \tfrac{1}{24}\,z(3 + 2z)/(1-z)^{7/2}, \tag{9.2}$$

because we know from (3.4), (4.8), (5.2), and (5.14) that 

$$F_1(z) = e^{V(z)}E_1(z) = \tfrac{1}{24}\,T(z)\bigl(3 + 2T(z)\bigr)/(1 - T(z))^{7/2}.$$

The coefficient of z"^ in Fi{z) is the sum of k(M) over all multigraphs M on n labeled 
vertices having n + l edges and all vertices of degree 2 or more, divided by n\. For example, 
the coefficient of z is 1/8; this is the compensation factor of the multigraph with a single 
vertex x and two loops from x to itself. The coefficient of 2;^ is || = ||/2!; the smooth 
labeled multigraphs 

(X) CO 00 ^ coo ODO 

12 21 12 12 12 21 

have compensation factors i, i, i, i, and respectively, summing to ||. 

The smooth multigraph $\bar M$ obtained by repeatedly pruning $M$ is called the core of $M$ 
(see [26]). Let $\bar{\mathcal F}$ be any family of smooth multigraphs, and let $\mathcal F$ be the set of all cyclic 
multigraphs whose core is a member of $\bar{\mathcal F}$. The argument that proves (9.1) also proves that 
the univariate and bivariate generating functions for $\mathcal F$ and $\bar{\mathcal F}$ are related by the equations 

$$F(z) = \bar F(T(z)); \qquad F(w,z) = \bar F\bigl(w,\,T(wz)/w\bigr). \tag{9.3}$$
In particular we have $\hat E_r(z) = \bar{\hat E}_r(T(z))$, where $\bar{\hat E}_r$ counts all smooth graphs of excess $r$ 
having no unicyclic components. This relationship accounts for the curious formula (6.11) 
about the last nonvanishing coefficient $\hat e_{rd}$; we can reason as follows: The minimum number 
of vertices among all graphs of excess $r$, when $\binom{s-2}{2} \le r < \binom{s-1}{2}$, is $s$, because a graph 
on $s - 1$ vertices has at most $\binom{s-1}{2}$ edges and $\binom{s-1}{2} < s - 1 + r$. The coefficient of the 
minimum power of $\zeta$ in $\hat E_r = \bar{\hat E}_r\bigl(\zeta/(1+\zeta)\bigr)$ therefore comes entirely from the $\binom{s(s-1)/2}{s+r}$ 
graphs on $s$ labeled vertices having exactly $s + r$ edges. All such graphs are smooth. 

When $\bar M$ has no unicyclic components we can go beyond pruning to another kind of 
vertexectomy that we will call cancelling: If any vertex has degree 2, we can remove it 
and splice together the two edges that it formerly touched. Repeated application of this 
process on any smooth multigraph $\bar M$ of excess $r$ will lead to a multigraph $\bar{\bar M}$ of excess $r$ 
in which every vertex has degree 3 or more. (A self-loop $(x, x)$ is assumed to contribute 2 
to the degree of $x$. A vertex with a self-loop will be connected to at least one other vertex, 
because there are no unicycles, so we will never cancel it.) The multigraph $\bar{\bar M}$ can be called 
reduced. Only the middle two multigraphs of the six pictured above are reduced. 

There are only finitely many reduced multigraphs of excess $r$. For if such a multigraph 
has $n$ vertices of degrees $d_1, d_2, \ldots, d_n$, it has $n + r = \frac12(d_1 + d_2 + \cdots + d_n) \ge \frac32 n$ edges, 
hence $n \le 2r$. The extreme case $n = 2r$ occurs if and only if the multigraph is 3-regular, 
i.e., every vertex has degree exactly 3. We will see later that such regularity is, in fact, 
normal: The complex components of a random graph or multigraph with + o(n^^^) 
edges almost always reduce to components that are 3-regular. 

The reduced multigraph $\overline{M}$ obtained by pruning and cancellation from a given complex
multigraph $M$ is called the kernel of $M$ (see [26]). Our immediate goal is to find the
generating function for all smooth multigraphs $M$ without unicyclic components that have
a given reduced multigraph $\overline{M}$ as their kernel. For this it will be convenient to introduce
another representation of a multigraph $M$: We label both the vertices and the edges, and
we assign an arbitrary orientation to each edge, thereby obtaining a directed edge-labeled
multigraph. Let $V = V(M)$ be the set of vertex labels and $E = E(M)$ the set of edge labels.
Each edge $e \in E$ has a dual edge $\bar e$, and $\overline{E}$ is the set of all dual edges. The multigraph $M$
is then represented as a mapping $M$ from $E \cup \overline{E}$ to $V$, with the interpretation that each
directed edge $e$ runs from $M(e)$ to $M(\bar e)$. The dual of $\bar e$, namely $\bar{\bar e}$, is $e$; thus $\bar e$ runs from
$M(\bar e)$ to $M(e)$.

If the vertex labels are $1, \ldots, n$ and if the edge labels are $1, \ldots, m$, the multigraph
mapping $M$ takes the set $\{1, \ldots, m, \bar 1, \ldots, \bar m\}$ into the set $\{1, \ldots, n\}$. Any such mapping is
equivalent to a sequence $(x_1,y_1) \ldots (x_m,y_m)$ of ordered pairs generated by the multigraph
process of section 1, where $x_k = M(k)$ and $y_k = M(\bar k)$.

The number of different mappings $M$ corresponding to a given multigraph $M$ is
$2^m m!\,\kappa(M)$, where $\kappa$ is the compensation factor defined in (1.1). This holds because
$2^m m!$ is the number of ways to orient the edges and to assign edge labels, and $\kappa$ accounts
for duplicate assignments that leave us with the same mapping $M$.
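
The count $2^m m!\,\kappa(M)$ can be checked by brute force on tiny examples. The sketch below assumes that $\kappa(M)$ is the reciprocal of $\prod_x 2^{m_{xx}} m_{xx}! \prod_{x<y} m_{xy}!$, which is how the text counts edge automorphisms; the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial

def kappa(edges):
    """Compensation factor, assuming kappa = 1 / (prod 2^{m_xx} m_xx! * prod m_xy!)."""
    mult = Counter(tuple(sorted(e)) for e in edges)
    denom = 1
    for (x, y), m in mult.items():
        denom *= factorial(m) * (2 ** m if x == y else 1)
    return Fraction(1, denom)

def count_sequences(edges, n):
    """Count sequences of ordered pairs over {1..n} that define the multigraph `edges`."""
    target = Counter(tuple(sorted(e)) for e in edges)
    m = len(edges)
    pairs = list(product(range(1, n + 1), repeat=2))
    return sum(
        1
        for seq in product(pairs, repeat=m)
        if Counter(tuple(sorted(p)) for p in seq) == target
    )
```

The enumeration is exponential in the number of edges, so it is only meant for graphs with two or three edges.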

Duplicate assignments can be treated more formally as follows. A signed permutation
$\sigma$ of a set $E$ and its dual $\overline{E}$ is a permutation of $E \cup \overline{E}$ with the property that $\sigma\bar e = \overline{\sigma e}$
for all $e$. (The group of all signed permutations on a set of $m$ elements is conventionally
called the hyperoctahedral group $B_m$; it is the group of all $2^m m!$ symmetries of an
$m$-cube.) Given a multigraph represented as a mapping $M$ from $E \cup \overline{E}$ to $V$, an edge
automorphism is a signed permutation $\sigma$ of $E \cup \overline{E}$ with the property that $M(\sigma e) = M(e)$.

It is easy to see that the number of edge automorphisms of $M$ is $1/\kappa(M)$. Such
a mapping $\sigma$ must be the product of one of the $2^{m_{xx}}(m_{xx})!$ signed permutations of the
$m_{xx}$ self-loops from $x$ to $x$, for each $x$, times one of the $(m_{xy})!$ signed permutations of the
$m_{xy}$ edges from $x$ to $y$, for each $x < y$. Edge automorphisms are the automorphisms of
multigraphs with labeled vertices and unlabeled edges; this explains why $\kappa(M)$ is used as
a weighting function for each $M$ in the generating functions we have been discussing.

We are now ready to prove a basic lemma about multigraphs, motivated by but 
noticeably simpler than the corresponding result for graphs obtained by Wright [41] : 

Lemma 1. If $\overline{M}$ is a reduced multigraph having $\nu$ vertices, $\mu$ edges, and compensation
factor $\kappa$, the generating function for all smooth, complex multigraphs $M$ that reduce to $\overline{M}$
under cancellation is

$$\frac{\kappa\, z^{\nu}}{\nu!\,(1-z)^{\mu}}\,.$$

Proof. This result is "intuitively obvious," but it requires a formal proof to ensure that
everything is counted properly in the presence of compensation factors. We assume that
$\overline{M}$ is represented by a fixed mapping from edges and dual edges to vertices, where the set
of edge labels is $\{[1], \ldots, [\mu]\}$ and the set of vertex labels is $\{(1), \ldots, (\nu)\}$. The dual of
edge $[j]$ will be denoted by $\overline{[j]} = [-j]$. The given multigraph mapping can be represented
as a function $\overline{M}$ from $\{-\mu, \ldots, -1, 1, \ldots, \mu\}$ to $\{1, \ldots, \nu\}$, such that edge $[j]$ runs from
$(\overline{M}(j))$ to $(\overline{M}(-j))$ and edge $[-j]$ runs from $(\overline{M}(-j))$ to $(\overline{M}(j))$. Square brackets and
round parentheses are used notationally here in order to distinguish edge labels from vertex
labels, although $\overline{M}$ is a function from integers to integers.

Let $s_n$ be the coefficient of $z^n$ in $z^{\nu}/(1-z)^{\mu}$. This quantity $s_n$ is the number of
solutions $(n_1, \ldots, n_\mu)$ to the equation

$$n_1 + \cdots + n_\mu = n - \nu \tag{9.5}$$

in nonnegative integers. Let $m - \mu = n - \nu$; then $m$ is the number of edges in an $n$-vertex
multigraph that cancels to $\overline{M}$.
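
As a quick sanity check of this interpretation of $s_n$, the following sketch compares the stars-and-bars formula for the coefficient with a direct enumeration of the solutions of (9.5). The function names `s` and `s_bruteforce` are hypothetical, and the enumeration is only feasible for small parameters.

```python
from itertools import product
from math import comb

def s(n, nu, mu):
    """Coefficient of z^n in z^nu / (1 - z)^mu, by the stars-and-bars formula."""
    k = n - nu
    if k < 0:
        return 0
    if mu == 0:
        return 1 if k == 0 else 0
    return comb(k + mu - 1, mu - 1)

def s_bruteforce(n, nu, mu):
    """Count nonnegative integer solutions of n_1 + ... + n_mu = n - nu directly."""
    k = n - nu
    if k < 0:
        return 0
    return sum(1 for sol in product(range(k + 1), repeat=mu) if sum(sol) == k)
```

With $\nu = 2$ and $\mu = 3$ (the two-vertex kernels of excess 1), the two counts agree for every small $n$.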

We will construct $2^m m!\, n!\, s_n/\nu!$ sequences of ordered pairs $(x_1,y_1) \ldots (x_m,y_m)$ of
integers $1 \le x_j, y_j \le n$ such that (a) every constructed sequence defines a smooth
multigraph that cancels to $\overline{M}$; (b) every sequence that defines such a smooth multigraph is
constructed exactly $1/\kappa$ times. This will prove the lemma, because of (2.2). As noted
earlier, constructing a sequence $(x_1,y_1) \ldots (x_m,y_m)$ is equivalent to constructing a map $M$
from $\{-m, \ldots, -1, 1, \ldots, m\}$ into $\{1, \ldots, n\}$, if we let $x_j = M(j)$ and $y_j = M(-j)$.

The construction is as follows. For each ordered solution $(n_1, \ldots, n_\mu)$ to (9.5), we
effectively insert $n_j$ new vertices into edge $[j]$, thereby undoing the effect of cancellation.
Formally, we construct a set of $m$ edge labels

$$E = \{\,[j,k] \mid 1 \le j \le \mu,\ 0 \le k \le n_j\,\} \tag{9.6}$$

and a set of $n$ vertex labels

$$V = \{\,(i) \mid 1 \le i \le \nu\,\} \cup \{\,(j,k) \mid 1 \le j \le \mu,\ 1 \le k \le n_j\,\}\,. \tag{9.7}$$

Edge $[j,k]$ runs from vertex $(j,k)$ to vertex $(j,k+1)$, where we define for convenience

$$(j,0) = (\overline{M}(j))\,, \qquad (j, n_j+1) = (\overline{M}(-j))\,. \tag{9.8}$$

Thus the original edge $[j]$ from $(\overline{M}(j))$ to $(\overline{M}(-j))$ has become a sequence of $n_j + 1$ edges
$[j,0] \ldots [j,n_j]$ between the same two vertices, with intermediate vertices $(j,1), \ldots, (j,n_j)$.
The dual of edge $[j,k]$ will be denoted by $-[j,k]$. We also define

$$[-j,k] = -[j, n_{|j|}-k]\,, \qquad (-j,k) = (j, n_{|j|}+1-k)\,; \tag{9.9}$$

this means that the original edge $[-j]$ has become the edge sequence $[-j,0] \ldots [-j,n_j]$,
which is the reverse of $[j,0] \ldots [j,n_j]$. Edge $[-j,k]$ runs from $(-j,k)$ to $(-j,k+1)$.

To complete the construction, let $f$ be any one-to-one mapping from $V$ to $\{1, \ldots, n\}$
that preserves the order of the original labels $(1), \ldots, (\nu)$; and let $g$ be any signed bijection
from $E \cup \overline{E}$ to $\{-m, \ldots, -1, 1, \ldots, m\}$. (A signed bijection is a one-to-one correspondence
such that $g(\bar e) = -g(e)$.) Then we define

$$M(g([j,k])) = f((j,k))\,, \tag{9.10}$$

for all $[j,k]$ in $E \cup \overline{E}$. This mapping $M$ corresponds to a sequence $(x_1,y_1) \ldots (x_m,y_m)$ that
defines a multigraph $M$ on $\{1, \ldots, n\}$, as stated above. We have constructed $2^m m!\, n!\, s_n/\nu!$
such sequences, since there are $2^m m!$ choices for $g$ and $n!/\nu!$ for $f$, given any solution
$(n_1, \ldots, n_\mu)$ to (9.5).

It is clear that $M$ is a smooth multigraph on $n$ vertices that cancels to the given
reduced multigraph $\overline{M}$, and that every such $M$ is constructed at least once. We need
to verify that every mapping $M$ is obtained exactly $1/\kappa$ times among the $2^m m!\, n!\, s_n/\nu!$
constructed mappings.
Suppose $M$ has been constructed from $((n_1, \ldots, n_\mu), f, g)$, and suppose $\sigma$ is one of
the $1/\kappa$ edge automorphisms of $\overline{M}$. We will define a new construction $((n'_1, \ldots, n'_\mu), f', g')$
that produces the same mapping $M$. Our notational conventions allow us to regard $\sigma$ as
a permutation of $\{-\mu, \ldots, -1, 1, \ldots, \mu\}$, where

$$\sigma(-j) = -\sigma j \quad\text{and}\quad \overline{M}(\sigma j) = \overline{M}(j)\,. \tag{9.11}$$

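The new construction is defined by

$$\begin{aligned}
n'_j &= n_{|\sigma j|}\,, && 1 \le j \le \mu;\\
f'((i)') &= f((i))\,, && 1 \le i \le \nu;\\
f'((j,k)') &= f((\sigma j,k))\,, && 1 \le j \le \mu,\ 1 \le k \le n'_j;\\
g'([j,k]') &= g([\sigma j,k])\,, && 1 \le j \le \mu,\ 0 \le k \le n'_j;\\
M'(g'([j,k]')) &= f'((j,k)')\,, && 1 \le |j| \le \mu,\ 0 \le k \le n'_{|j|}\,.
\end{aligned} \tag{9.12}$$

Here $(j,k)'$ and $[j,k]'$ are the new vertex and edge labels corresponding to $(n'_1, \ldots, n'_\mu)$;
they are defined in (9.6)-(9.9).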

It is easy to verify that the definitions in (9.12) imply validity of the same formulas
for the whole range of $j$ and $k$ values:

$$\begin{aligned}
f'((j,k)') &= f((\sigma j,k))\,, && 1 \le |j| \le \mu,\ 0 \le k \le n'_{|j|}+1;\\
g'([j,k]') &= g([\sigma j,k])\,, && 1 \le |j| \le \mu,\ 0 \le k \le n'_{|j|}\,.
\end{aligned} \tag{9.13}$$

For example, if $j > 0$ we have

$$\begin{aligned}
f'((j,0)') &= f'((\overline{M}(j))') = f((\overline{M}(j))) = f((\overline{M}(\sigma j))) = f((\sigma j,0))\,;\\
f'((j,n'_j+1)') &= f'((\overline{M}(-j))') = f((\overline{M}(-j)))\\
&= f((\overline{M}(\sigma(-j)))) = f((\overline{M}(-\sigma j))) = f((\sigma j, n'_j+1))\,;\\
f'((-j,k)') &= f'((j,n'_j+1-k)') = f((\sigma j, n'_j+1-k))\\
&= f((\sigma j, n_{|\sigma j|}+1-k)) = f((-\sigma j,k)) = f((\sigma(-j),k))\,.
\end{aligned}$$

Therefore if $l$ is any value in $\{-m, \ldots, -1, 1, \ldots, m\}$, we can verify that $M'(l) = M(l)$, as
follows: There are unique $j$ and $k$ such that $l = g([\sigma j,k])$. Hence $l = g'([j,k]')$, and

$$M'(l) = f'((j,k)') = f((\sigma j,k)) = M(l)\,.$$

Conversely, if $((n'_1, \ldots, n'_\mu), f', g')$ is another construction that makes $M'(l) = M(l)$
for all $l$, we can reverse this process and find a unique edge automorphism $\sigma$ satisfying all
the conditions of (9.12). Exactly $\nu$ of the vertices of $M = M'$ have degree $\ge 3$, since $\overline{M}$ is
reduced; these are the images under $f$ and $f'$ of $(1), \ldots, (\nu)$, and they have the same order
in $M$. Therefore $f'((i)') = f((i))$ for $1 \le i \le \nu$.

Let $l = g'([j,0]')$. Since $M'(l) = f'((j,0)') = f'((\overline{M}(j))') = f((\overline{M}(j)))$, we know that
$M(l)$ must be a vertex of degree $\ge 3$, so there must be a value $j'$ (either positive or negative)
such that $l = g([j',0])$. This rule defines $\sigma j = j'$. We have $M(l) = f((j',0)) = f((\overline{M}(j')))$,
hence $\overline{M}(\sigma j) = \overline{M}(j)$.

Let us say that the edge $[j,k]'$ of $M'$ corresponds to the edge $[j',k']$ of $M$ if $g'([j,k]') =
g([j',k'])$. We have defined $\sigma j$ for $1 \le |j| \le \mu$ in such a way that $[j,0]'$ corresponds
to $[\sigma j,0]$. Suppose we know that $[j,k]'$ corresponds to $[\sigma j,k]$ for some $k < n'_j$; then
$-[j,k]'$ also corresponds to $-[\sigma j,k]$. Also $M'(g'(-[j,k]')) = M'(g'([-j,n'_j-k]')) =
f'((-j,n'_j-k)') = f'((j,k+1)') = M'(g'([j,k+1]'))$ is a vertex $v$ of degree 2 in $M$, which
therefore equals $M(g(-[\sigma j,k])) = f((-\sigma j, n_{|\sigma j|}-k))$. Consequently we have $k < n_{|\sigma j|}$,
$f'((j,k+1)') = f((\sigma j,k+1))$, and $v = M'(g'([j,k+1]')) = M(g([\sigma j,k+1]))$. Now
$[j,k+1]'$ must correspond to $[\sigma j,k+1]$, since there is only one value $l \ne -g'([j,k]')$ such
that $M(l) = v$. In this way we prove inductively that $[j,k]'$ corresponds to $[\sigma j,k]$ for
$0 \le k \le n'_j$, and that $n'_j = n_{|\sigma j|}$. Hence (9.12) holds. □

Let $\overline{F}$ be a family of reduced multigraphs, and let $F$ be the family of all smooth
complex multigraphs that reduce under cancellation to a member of $\overline{F}$. The bivariate
generating functions of $F$ and $\overline{F}$ are then related by the equation

$$F(w,z) = \overline{F}\bigl(w/(1-wz),\, z\bigr)\,, \tag{9.14}$$

because Lemma 1 establishes this relation in the case that $\overline{F}$ has only one member.
Equation (9.14) says simply that every edge in $\overline{F}$, represented by $w$, is to be replaced by a
sequence of one or more edges, represented by $w/(1-wz) = w + w^2 z + w^3 z^2 + \cdots$; perhaps
this means that Lemma 1 is indeed obvious and that the lengthy proof was unnecessary.
It is, however, comforting to know that a formal verification is possible, when one is
beginning to learn the power of generating function techniques. And somehow, examples of
multigraphs with numerous self-loops and repeated edges do seem to mandate a formal
proof, because compensation factors change when edges are manipulated.

As an example of Lemma 1, let us derive explicitly the generating function $E_1(z) =
C_1(z)$ for all smooth bicyclic multigraphs. All such multigraphs cancel to a reduced
multigraph of excess 1, which can have at most 2 vertices and 3 edges. There are only three
possibilities,

[Figure (9.15): the three reduced multigraphs of excess 1, namely a single vertex with two
self-loops; two vertices, each carrying a self-loop, joined by one edge; and two vertices
joined by three parallel edges.]

having $\kappa = \frac18$, $\frac14$, and $\frac16$, respectively. Therefore

$$E_1(z) = C_1(z) = \frac{z}{8(1-z)^2} + \frac{z^2}{8(1-z)^3} + \frac{z^2}{12(1-z)^3} = \frac{z(3+2z)}{24(1-z)^3}\,, \tag{9.16}$$

in agreement with (9.2). Wright [41] states that there are 15 connected, unlabeled, reduced
multigraphs of excess 2, and 107 of excess 3.
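
The arithmetic behind this example can be checked coefficient by coefficient with exact rationals. The sketch below assumes the three kernels are the figure-eight, the dumbbell and the triple edge, with compensation factors $1/8$, $1/4$, $1/6$ as given by the edge-automorphism count; the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def coeff(n, a, b, c):
    """[z^n] of z^a / (c (1-z)^b), for integers a >= 0, b >= 1, c >= 1."""
    k = n - a
    return Fraction(comb(k + b - 1, b - 1), c) if k >= 0 else Fraction(0)

# (kappa, nu, mu) for the assumed kernels: figure-eight, dumbbell, triple edge
KERNELS = [(Fraction(1, 8), 1, 2), (Fraction(1, 4), 2, 3), (Fraction(1, 6), 2, 3)]

def lemma1_sum(n):
    """[z^n] of the sum of kappa z^nu / (nu! (1-z)^mu) over the three kernels."""
    return sum(kappa * coeff(n, nu, mu, factorial(nu)) for kappa, nu, mu in KERNELS)

def closed_form(n):
    """[z^n] of z (3 + 2z) / (24 (1-z)^3)."""
    return 3 * coeff(n, 1, 3, 24) + 2 * coeff(n, 2, 3, 24)
```

The two coefficient sequences agree term by term; for example both give $11/24$ at $n = 2$.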

If a reduced multigraph of excess $r$ has exactly $2r - d$ vertices, we will say that it
has deficiency $d$. A reduced multigraph of deficiency 0 is 3-regular; we will call such
multigraphs clean.

Corollary. The coefficient $e_{rd}$ in (5.10) and (7.3) is $(2r-d)!^{-1} \sum \kappa(M)$ summed over all
reduced, labeled multigraphs $M$ of excess $r$ and deficiency $d$. The coefficient $c_{rd}$ in (8.4)
can be obtained in the same way, but restricting the sum to connected multigraphs. □

This corollary leads to a completely different proof of Theorem 1, because it allows us
to obtain formula (7.3) for $e_{rd}$ by a combinatorial counting argument. Consider a reduced
multigraph that has exactly $d_k$ vertices of degree $k$, for each $k \ge 3$; then $d_3 + d_4 + \cdots = n$
and $3d_3 + 4d_4 + \cdots = 2m$. We can calculate $\sum \kappa(M)$ over all such $M$ by counting the
number of relevant sequences $(x_1,y_1) \ldots (x_m,y_m)$ and dividing by $2^m m!$; and the number
of ways to choose $(x_1,y_1) \ldots (x_m,y_m)$ is clearly a product of multinomial coefficients,

$$\frac{(2m)!}{3!^{d_3}\, 4!^{d_4} \cdots}\ \cdot\ \frac{n!}{d_3!\, d_4! \cdots}\,,$$

since the first factor is the number of ways to partition $2m$ slots into $d_k$ labeled classes
of size $k$ for each $k$, and the second factor counts the assignments of vertex labels to
those classes. To obtain all reduced multigraphs of excess $r$ and deficiency $d$ we sum
over all sequences of nonnegative integers $(d_3, d_4, \ldots)$ such that $\sum_{k\ge3} d_k = 2r - d$ and
$\sum_{k\ge3} k\,d_k = 6r - 2d$, or equivalently

$$\sum_{k\ge3} (k-3)\,d_k = d \quad\text{and}\quad \sum_{k\ge3} (k-2)\,d_k = 2r\,.$$

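Let

$$f_{rd} = \sum \Bigl\{\, \prod_{k\ge3} \frac{1}{k!^{d_k}\, d_k!} \ \Bigm|\ \sum_{k\ge3} (k-3)\,d_k = d \ \text{ and } \ \sum_{k\ge3} (k-2)\,d_k = 2r \,\Bigr\}\,. \tag{9.17}$$

We have just proved that

$$e_{rd} = \frac{(6r-2d)!}{2^{3r-d}\,(3r-d)!}\, f_{rd}\,.$$

And we can readily calculate a bivariate generating function for the coefficients $f_{rd}$: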



r,d>0 d3,di,...>0 fc>3 



= n E 

fc>3 dfe>0 



k>3 



where F is the function defined in (7.5). Comparing (9.18) to (7.3) now yields the promised 
proof of (7.4): 

Pd(r)=22-+'^32^-'^(2r-d)!/(20d 

= 22^+'^32^-'^(2r - d)\ [w'^z^''] exp{zF{wz/4:)/6) 
= 22'^+'^32'-'^(2r - d)l [w'^^^r-d] exp(2F(«;/4)/6) 

= 22''(2r - d)\ [w'^z'^''-'^] exp{zF{w/A)) = [w'^] F{wf^-'^ . 

These observations also allow us to express $e_{rd}$ in the suggestive form

$$e_{rd} = \frac{1}{2^{3r-d}\,(3r-d)!}\, \left\{ {6r-2d \atop 2r-d} \right\}_{\ge3}\,, \tag{9.19}$$

where $\left\{ {m \atop n} \right\}_{\ge3}$ denotes the number of ways to partition an $m$-element set into $n$ subsets, each
containing at least 3 elements. The asymptotic behavior of the integers $2^{3r-d}(3r-d)!\, e_{rd}$
will therefore be analogous to the asymptotic behavior of Stirling numbers.
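
To make the Stirling-like form concrete, here is a small Python sketch that counts set partitions with all blocks of size at least 3 and evaluates $e_{rd}$ from it. It assumes the reading $e_{rd} = \left\{ {6r-2d \atop 2r-d} \right\}_{\ge3} / \bigl(2^{3r-d}(3r-d)!\bigr)$ of (9.19); the function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling_ge3(m, n):
    """Number of partitions of an m-element set into n blocks, each of size >= 3."""
    if n == 0:
        return 1 if m == 0 else 0
    if m < 3 * n:
        return 0
    # The block containing the largest element has size j >= 3:
    # choose its other j - 1 members, then partition the remaining m - j elements.
    return sum(comb(m - 1, j - 1) * stirling_ge3(m - j, n - 1) for j in range(3, m + 1))

def e(r, d):
    """e_{rd} under the assumed form of (9.19)."""
    return Fraction(stirling_ge3(6 * r - 2 * d, 2 * r - d),
                    2 ** (3 * r - d) * factorial(3 * r - d))
```

For excess 1 this gives $e_{10} = 5/24$ (dumbbell plus triple edge) and $e_{11} = 1/8$ (figure-eight), consistent with the three terms of (9.16).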

Lemma 1 captures the combinatorial essence of the generating functions for all complex
multigraphs. We can obtain a similar generating function for graphs instead of multigraphs,
but we must work a bit harder, and the formulas are not as attractive. The following
improvement over Wright's original treatment [41] is based on an approach suggested by
V. E. Stepanov [36].

Lemma 2. Let $\overline{M}$ be a reduced multigraph having $\nu$ vertices, $\mu$ edges, compensation
factor $\kappa$, and $\mu_{xy}$ edges between $x$ and $y$ for $1 \le x \le y \le \nu$. The generating function for
all smooth, complex graphs $G$ that lead to $\overline{M}$ under cancellation is

$$\frac{\kappa\, z^{\nu}\, P(\overline{M}, z)}{\nu!\,(1-z)^{\mu}}\,, \tag{9.20}$$

where

$$P(\overline{M}, z) = \prod_{x=1}^{\nu} \Bigl( z^{2\mu_{xx}} \prod_{y=x+1}^{\nu} z^{\mu_{xy}-1}\bigl(\mu_{xy} - (\mu_{xy}-1)z\bigr) \Bigr) \tag{9.21}$$

is a polynomial in $z$ such that $P(\overline{M}, 1) = 1$.

Proof. We argue as in Lemma 1, but we must restrict the solutions $(n_1, \ldots, n_\mu)$ of (9.5)
to cases that produce a graph instead of a multigraph. Thus, each $n_j$ that corresponds
to a self-loop must be $\ge 2$, so we use $z^2/(1-z)$ instead of $1/(1-z)$ in the contribution
that $n_j$ makes to the overall generating function. A subsequence $(n_j, \ldots, n_{j+k-1})$ that
corresponds to $k = \mu_{xy}$ edges between distinct vertices $x < y$ must have the property that
at most one of $(n_j, \ldots, n_{j+k-1})$ is zero; hence we use

$$\frac{z^k}{(1-z)^k} + \frac{k z^{k-1}}{(1-z)^{k-1}} = \frac{z^{k-1}\bigl(k - (k-1)z\bigr)}{(1-z)^k}$$

instead of $1/(1-z)^k$ in its contribution. The net effect is to multiply the previous generating
function by $P(\overline{M}, z)$. □
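
The polynomial of (9.21) is easy to compute mechanically. The following sketch assumes a kernel is given as a dictionary from vertex pairs $(x, y)$ with $x \le y$ to edge multiplicities (a representation chosen here for illustration), and returns the coefficient list of $P$ with the constant term first.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def penalty_polynomial(mult):
    """Coefficients of P(M, z), reading (9.21) factor by factor.

    `mult` maps (x, y) with x <= y to the number of edges between x and y.
    """
    poly = [1]
    for (x, y), m in mult.items():
        if m == 0:
            continue                                  # absent edges contribute a factor 1
        if x == y:
            factor = [0] * (2 * m) + [1]              # z^{2m} for m self-loops at x
        else:
            factor = [0] * (m - 1) + [m, -(m - 1)]    # z^{m-1} (m - (m-1) z)
        poly = poly_mul(poly, factor)
    while len(poly) > 1 and poly[-1] == 0:
        poly.pop()                                    # drop trailing zero coefficients
    return poly
```

For the triple edge this yields $z^2(3 - 2z)$, for the figure-eight and the dumbbell it yields $z^4$, and in every case the coefficients sum to 1, as $P(\overline{M},1) = 1$ requires.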

Replacing $z$ by $T(z)$ gives the generating function for all graphs that prune and cancel
to $\overline{M}$. For example, the generating function $\hat E_1(z) = \hat C_1(z) = \hat W(z)$ of (3.6) can be read
off from (9.15): It is

$$\frac{T(z)^5}{8(1-T(z))^2} + \frac{T(z)^6}{8(1-T(z))^3} + \frac{T(z)^4\,(3-2T(z))}{12(1-T(z))^3}\,.$$
The degree of the polynomial P{M, z) is the total number of "penalty points" of M, 
where each self-loop costs two penalty points, and where each cluster of ^xy > 1 multiple 
edges between distinct vertices costs ii^y — 1- If M is a graph, the degree is zero and 
P{M, z) = 1. At the other extreme, if all edges of M are self-loops, the degree is 2|U. 

The quantity T{zY / (l ~T{z))^ becomes {1 -\- , when we express it in terms of 
the variable C, = T{z) / (l — T(2)) introduced in section 5; the quantity P[M, T{z)^ becomes 
P{Mj C/ (1 + 0) . If we restrict consideration to connected multigraphs of excess r, we get 
rational functions of C with denominator (1 -|- CY'^'^; this denominator occurs when there 
are (r -|- 1) self-loops in M. However, we have seen in Theorem 3 that the denominator of 
Cr is always a divisor of (1 -|- C)^- There seems to be no easy combinatorial explanation 
for the cancellation that occurs when the contributions of different M are added together. 
Some of the properties of connected graphs are easier to derive by combinatorics, others 
are easier to derive by algebra. 

The actual coefficients of $P(\overline{M}, \zeta/(1+\zeta))$ do not make any significant difference
asymptotically, when graphs are sparse; we will see later that the asymptotic behavior as
$\zeta \to \infty$ is what counts, hence we only need to know that $P(\overline{M}, 1) = 1$. We observed earlier
that the leading coefficients $e_{r0}$ and $\hat e_{r0}$ of $E$ and $\hat E$ are equal, as are the leading coefficients
$c_{r0}$ and $\hat c_{r0}$. Now Lemma 2 shows in fact that each reduced multigraph $\overline{M}$ makes the same
contribution to the leading coefficient for graphs as it does for multigraphs.

10. A lemma from contour integration. Studies of random graphs that have $m \approx \frac12 n$
edges are traditionally broken into two cases, the "subcritical" case where $m < \frac12 n$ and
the "supercritical" case where $m > \frac12 n$. It is desirable, however, to have estimates of
probabilities that hold uniformly for all $m$ in the vicinity of $\frac12 n$, passing smoothly from
one side to the other. The following lemma, based on techniques introduced in [14], will
be our key tool for the computation of probabilities.

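Lemma 3. If $m = \frac12 n(1 + \mu n^{-1/3})$ and if $y$ is any real constant, we have

$$\frac{2^m\, m!\, n!}{(n-m)!\, n^{2m}}\ [z^n]\, \frac{U(z)^{n-m}}{(1-T(z))^{y}} = \sqrt{2\pi}\, A(y,\mu)\, n^{y/3-1/6} + O\bigl((1+|\mu|^B)\, n^{y/3-1/2}\bigr) \tag{10.1}$$

uniformly for $|\mu| \le n^{1/12}$, where $B = \max(4, \frac92 - y)$ and

$$A(y,\mu) = \frac{e^{-\mu^3/6}}{3^{(y+1)/3}} \sum_{k\ge0} \frac{\bigl(\frac12 3^{2/3}\mu\bigr)^k}{k!\ \Gamma\bigl((y+1-2k)/3\bigr)}\,. \tag{10.2}$$

As $\mu \to -\infty$, we have

$$A(y,\mu) = \frac{1}{\sqrt{2\pi}\,|\mu|^{y-1/2}} \Bigl( 1 - \frac{3y^2+3y-1}{6|\mu|^3} + O(\mu^{-6}) \Bigr)\,; \tag{10.3}$$

as $\mu \to +\infty$, we have

$$A(y,\mu) = \frac{e^{-\mu^3/6}}{2^{y/2}\,\mu^{1-y/2}} \Bigl( \frac{1}{\Gamma(y/2)} + \frac{4\mu^{-3/2}}{3\sqrt2\ \Gamma(y/2-3/2)} + O(\mu^{-2}) \Bigr)\,. \tag{10.4}$$

Moreover, (10.1) can be improved to

$$\frac{2^m\, m!\, n!}{(n-m)!\, n^{2m}}\ [z^n]\, \frac{U(z)^{n-m}}{(1-T(z))^{y}} = \sqrt{2\pi}\, A(y,\mu)\, n^{y/3-1/6}\, \bigl(1 + O(\mu^4 n^{-1/3})\bigr) \tag{10.5}$$

if $|\mu|$ goes to infinity with $n$ while remaining $\le n^{1/12}$.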

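Proof. First we need to derive some auxiliary results about the function $A$. If $\alpha$ is any
positive number, we define a path $\Pi(\alpha)$ in the complex plane that consists of the following
three straight line segments:

$$s(t) = \begin{cases} -e^{-\pi i/3}\, t, & \text{for } -\infty < t \le -2\alpha;\\ \alpha + it \sin \pi/3, & \text{for } -2\alpha \le t \le +2\alpha;\\ e^{+\pi i/3}\, t, & \text{for } +2\alpha \le t < +\infty. \end{cases} \tag{10.6}$$

Now we define

$$A(y,\mu) = \frac{1}{2\pi i} \int_{\Pi(1)} s^{1-y}\, e^{K(\mu,s)}\, ds\,, \tag{10.7}$$

where $K(\mu,s)$ is the polynomial

$$K(\mu,s) = \frac{(s+\mu)^2\,(2s-\mu)}{6} = \frac{s^3}{3} + \frac{\mu s^2}{2} - \frac{\mu^3}{6}\,. \tag{10.8}$$

Our first goal is to show that $A(y,\mu)$ satisfies (10.2), (10.3), and (10.4).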

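To get (10.2), we make the substitution $u = s^3/3$. As $s$ traverses $\Pi(1)$, the variable $u$
traverses an interesting contour $\Gamma$ that begins at $-\infty$ and hugs the lower edge of the
negative axis, then circles the origin counterclockwise and returns to $-\infty$ along the upper
edge of the axis. On this contour $\Gamma$ we have Hankel's well-known formula for the reciprocal
Gamma function,

$$\frac{1}{\Gamma(z)} = \frac{1}{2\pi i} \int_{\Gamma} \frac{e^u\, du}{u^z}\,.$$

(See, for example, [18, Theorem 8.4b].) So we can expand (10.7) into an absolutely
convergent series, after substituting $3^{1/3}u^{1/3}$ for $s$:

$$\begin{aligned}
\frac{1}{2\pi i} \int_{\Pi(1)} s^{1-y}\, e^{K(\mu,s)}\, ds
&= \frac{e^{-\mu^3/6}}{3^{(y+1)/3}\, 2\pi i} \int_{\Gamma} \frac{e^u \exp\bigl(\frac12 3^{2/3}\mu\, u^{2/3}\bigr)}{u^{(y+1)/3}}\, du\\
&= \frac{e^{-\mu^3/6}}{3^{(y+1)/3}\, 2\pi i} \int_{\Gamma} \sum_{k\ge0} \frac{\bigl(\frac12 3^{2/3}\mu\bigr)^k\, e^u\, du}{k!\ u^{(y+1-2k)/3}}\,.
\end{aligned}$$

Interchanging summation and integration, and applying Hankel's formula, gives (10.2).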
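
The series is convenient for numerical experiments. The sketch below is an illustration only, not part of the paper: it evaluates a truncated version of the sum in floating point, assuming the form $A(y,\mu) = e^{-\mu^3/6}\,3^{-(y+1)/3} \sum_k (\frac12 3^{2/3}\mu)^k / \bigl(k!\,\Gamma((y+1-2k)/3)\bigr)$, with $1/\Gamma$ taken to be 0 at the poles. The names `rgamma`, `A` and `A_negative_asymptotic` are hypothetical, and the truncation at 120 terms is only adequate for moderate $|\mu|$.

```python
import math

def rgamma(x):
    """Reciprocal Gamma function 1/Gamma(x), equal to 0 at x = 0, -1, -2, ..."""
    if x <= 0 and x == math.floor(x):
        return 0.0
    return 1.0 / math.gamma(x)

def A(y, mu, terms=120):
    """Truncated series for A(y, mu); intended for moderate |mu| only."""
    x = 0.5 * 3 ** (2 / 3) * mu
    total = 0.0
    for k in range(terms):
        total += x ** k / math.factorial(k) * rgamma((y + 1 - 2 * k) / 3)
    return math.exp(-mu ** 3 / 6) * total / 3 ** ((y + 1) / 3)

def A_negative_asymptotic(y, mu):
    """Two-term approximation for mu -> -infinity (the assumed form of (10.3))."""
    a = abs(mu)
    return a ** (0.5 - y) / math.sqrt(2 * math.pi) * (1 - (3 * y * y + 3 * y - 1) / (6 * a ** 3))
```

At $\mu = 0$ only the $k = 0$ term survives, so $A(1,0) = 3^{-2/3}/\Gamma(2/3)$, which is the Airy value $\mathrm{Ai}(0)$; and already at $\mu = -3$ the series agrees with the two-term approximation to within a couple of percent.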

To get (10.3) and (10.4), we note first that the integral (10.7) can be taken over
any path $\Pi(\alpha)$, not just $\Pi(1)$, because $e^{K(\mu,s)}$ has no singularities. Moreover, we can
"straighten out" the path $\Pi(\alpha)$, changing it to a single straight line from $\alpha - i\infty$ to $\alpha + i\infty$,
if $\alpha$ is sufficiently large. For we can readily verify that the integrand is exponentially small
on any large circular arc $s = Re^{i\theta}$, as $|\theta|$ increases from $\pi/3$ to the angle where $R\cos\theta = \alpha$:
The real part of $s^3$ is $R^3 \cos 3\theta$, which increases from $-R^3$ to $4\alpha^3 - 3R^2\alpha$; and the real
part of $s^2$ lies between $-R^2$ and $-R^2/2$. Hence the real part of $K(\mu,s)$ will be at most
$-cR^2$ for some positive $c = c(\alpha)$ on the entire arc, whenever $\alpha > 0$ and $\alpha > -\frac12\mu$; this will
make $s^{1-y}e^{K(\mu,s)}$ exponentially small.

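If $\mu$ is negative, let $\alpha = -\mu$; then

$$\begin{aligned}
A(y,-\alpha) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} (\alpha+it)^{1-y}\, e^{K(-\alpha,\,\alpha+it)}\, dt\\
&= \frac{1}{2\pi\sqrt{\alpha}} \int_{-\infty}^{\infty} \Bigl(\alpha + \frac{it}{\sqrt{\alpha}}\Bigr)^{1-y} \exp\Bigl(-\frac{t^2}{2} - \frac{it^3}{3\alpha^{3/2}}\Bigr)\, dt\,,
\end{aligned}$$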



and we can find the asymptotic value of the remaining integral by using Laplace's standard 
technique of "tail-exchange" (see [17, section 9.4]): 

If we expand the integrand further, to terms that are 0(q;-'^^'^~^), we obtain 

The method can clearly be extended, in principle, to give a complete asymptotic series in 
powers of ^ beginning as shown in (10.3). 

We also want to know the asymptotic value of $A(y,\mu)$ as $\mu \to +\infty$, and for this we
need to work a bit harder. A combination of the methods we have used to prove (10.2)
and (10.3) will establish (10.4). The idea now is to integrate on the path $s = \mu^{-1} + it/\sqrt{\mu}$:

- t2(i+ ^-2) _ 1^^3^-3/2) 

K{ij,,ij,-^) poo 
where the last step replaces $it$ by $v$. We can distort the path of $v$ so that it crosses the
positive real axis, and then replace $v^2/2$ by $u$ to get Hankel's contour $\Gamma$ again:

For definiteness we can stipulate that the contour F lies entirely on the negative axis, 
except for a circular loop about with a radius of 1. When u is on the negative axis, say 
u = —t, the quantity y/2u will be —%\f2t on the first part of F and -\-%\f2t on the last, so 
we will have 

gW^, /i) = exp(iFi^(/x-'/2 + - 2t//-2 ± \i{2tfl'^^-^l'^) . 

On the portions of F for which \u\ > the integrand is superpolynomially small;* hence 

/ = / +0(e-'^V2), 

where is the subcontour that runs along the lower edge of the negative axis from — /i*^ 
to the circle u — e*^ and back to —\x'^ on the top edge of the axis. On r[//'^] we have 

and Jp |w~^/^e"| dit exists. Hence 

= f {u-y/^ + ^((1 - + 2«-(^-i)/2))e" du + Oifi'-^) 

= {fm + U(l7m) + r(fa-i)/2) ) ) + • 

The coefficient of /U"^/^ vanishes, because r((y + l)/2) = |(y — l)r((y — l)/2). We can 
use the same method to expand the integrand further, obtaining (10.4). 

* "Superpolynomially small" means that it approaches zero faster than any negative 
power of the argument. 
Notice that $1/\Gamma(y/2)$ or $1/\Gamma(y/2 - 3/2)$ may be zero, but not both. Therefore (10.4)
gives the asymptotically leading term of $A(y,\mu)$ in all cases.

Whew! We have worked pretty hard to establish (10.2)-(10.4), and we still haven't
begun to tackle the main assertion of the lemma. Fortunately, the work we have done so
far will help streamline the rest of the proof. The next step is to analyze the factor at the
left of (10.1); a routine application of Stirling's approximation shows that

$$\frac{2^m\, m!\, n!}{(n-m)!\, n^{2m}} = \sqrt{2\pi n}\ 2^{n-m}\, e^{-\mu^3/6 - n}\, \Bigl(1 + O\bigl((1+\mu^4)\, n^{-1/3}\bigr)\Bigr)\,, \tag{10.10}$$

uniformly for $|\mu| \le n^{1/12}$ as $n \to \infty$, when $m = \frac12 n(1 + \mu n^{-1/3})$.
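
The Stirling estimate can be sanity-checked numerically with log-Gamma functions. The sketch below assumes the reading $\sqrt{2\pi n}\,2^{n-m} e^{-\mu^3/6-n}$ of the right-hand side of (10.10) and treats $m$ as a real number, which `math.lgamma` permits; the function names are hypothetical.

```python
import math

def edges(n, mu):
    """m = (n/2)(1 + mu n^{-1/3}), treated as a real number."""
    return 0.5 * n * (1.0 + mu * n ** (-1.0 / 3.0))

def log_exact(n, mu):
    """Logarithm of 2^m m! n! / ((n-m)! n^{2m}), using lgamma for the factorials."""
    m = edges(n, mu)
    return (m * math.log(2) + math.lgamma(m + 1) + math.lgamma(n + 1)
            - math.lgamma(n - m + 1) - 2 * m * math.log(n))

def log_estimate(n, mu):
    """Logarithm of sqrt(2 pi n) 2^{n-m} exp(-mu^3/6 - n)."""
    m = edges(n, mu)
    return 0.5 * math.log(2 * math.pi * n) + (n - m) * math.log(2) - mu ** 3 / 6 - n
```

For $n = 10^6$ the two logarithms differ by roughly $\mu n^{-1/3}$, i.e. about $0.01$ when $\mu = \pm 1$, which is compatible with the stated relative error.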

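Now we turn to the other parts of (10.1). Equation (3.2) implies that $T$ has an analytic
continuation in which $T(ze^{-z}) = z$ for $|z| < 1$. Hence, by (3.3) and Cauchy's formula for
$[z^n]\, f(z)$, we can substitute $\tau = ze^{-z}$ and get

$$\begin{aligned}
[z^n]\, \frac{U(z)^{n-m}}{(1-T(z))^{y}} &= \frac{1}{2\pi i} \oint \frac{U(\tau)^{n-m}\, d\tau}{(1-T(\tau))^{y}\, \tau^{n+1}}\\
&= \frac{2^{m-n}\, e^{n}}{2\pi i} \oint (1-z)^{1-y}\, e^{nh(z)}\, \frac{dz}{z}\,,
\end{aligned} \tag{10.11}$$

where

$$\begin{aligned}
h(z) &= z - 1 - \frac{m}{n} \ln z + \Bigl(1 - \frac{m}{n}\Bigr) \ln(2-z)\\
&= z - 1 - \ln z - \Bigl(1 - \frac{m}{n}\Bigr) \ln \frac{1}{1-(z-1)^2}\,.
\end{aligned} \tag{10.12}$$

The contour in (10.11) should keep $|z| < 1$. Notice that $h(1) = h'(1) = 0$; if $m = \frac12 n$ we
also have $h''(1) = 0$. This triple zero accounts for the procedure we shall use to investigate
the value of (10.11) for large $n$.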

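Let $\nu = n^{-1/3}$, and let $\alpha$ be the positive solution to

$$\mu = \alpha^{-1} - \alpha\,. \tag{10.13}$$

We will evaluate (10.11) on the path $z = e^{-(\alpha+it)\nu}$, where $t$ runs from $-\pi n^{1/3}$ to $\pi n^{1/3}$:

$$\oint f(z)\, \frac{dz}{z} = i\nu \int_{-\pi n^{1/3}}^{\pi n^{1/3}} f\bigl(e^{-(\alpha+it)\nu}\bigr)\, dt\,. \tag{10.14}$$

It will turn out that the main contribution to the value of this integral comes from the
vicinity of $t = 0$.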
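The magnitude of $e^{h(z)}$ depends on $\Re h(z)$.† If $z = \rho e^{i\theta}$, we have

$$\Re h(\rho e^{i\theta}) = \rho\cos\theta - 1 - \frac{m}{n} \ln\rho + \frac12 \Bigl(1 - \frac{m}{n}\Bigr) \ln(4 - 4\rho\cos\theta + \rho^2)\,. \tag{10.15}$$

The derivative with respect to $\theta$ is $-\rho\, g(\theta) \sin\theta$, where

$$g(\theta) = 1 - \frac{2(1-\frac{m}{n})}{4 - 4\rho\cos\theta + \rho^2} \ \ge\ \frac{(2-\rho)^2 - 2(1-\frac{m}{n})}{(2-\rho)^2}\,; \tag{10.16}$$

and $g(\theta)$ is positive when $\rho = e^{-\alpha\nu}$, because $2(1-\frac{m}{n}) = 1 - \mu\nu < 1 + \alpha\nu < (2 - e^{-\alpha\nu})^2$.
(We always have $0 < \alpha\nu < 2$ when $|\mu| \le n^{1/12}$, and it is not difficult to verify that
$(2 - e^{-x})^2 > 1 + x$ when $0 < x < 2$.) Hence $\Re h(e^{-(\alpha+it)\nu})$ decreases as $|t|$ increases, and
$|e^{h(z)}|$ has its maximum on the circle $z = e^{-(\alpha+it)\nu}$ when $t = 0$.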
Looking further at $nh(e^{-s\nu})$, we have the asymptotic estimate

$$nh(e^{-s\nu}) = \tfrac13 s^3 + \tfrac12 \mu s^2 + O\bigl((\mu^2 s^2 + s^4)\,\nu\bigr)\,, \tag{10.17}$$

uniformly in any region such that $|s\nu| \le c$ where $c < \ln 2$. This follows from (10.12), using
the expansion

$$\ln \frac{1}{1-(e^{u}-1)^2} = u^2 + u^3 + O(u^4)\,, \qquad |u| \le c\,.$$
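
The estimate (10.17) is easy to probe numerically. The sketch below assumes $h(z) = z - 1 - \frac{m}{n}\ln z + (1 - \frac{m}{n})\ln(2 - z)$ with $m/n = \frac12(1 + \mu n^{-1/3})$, as read from (10.12); the function name `n_times_h` is hypothetical. It uses `expm1` and `log1p` because $h$ is a tiny difference of quantities of much larger size.

```python
import math

def n_times_h(s, mu, n):
    """Evaluate n * h(exp(-s * nu)) for real s, with nu = n^{-1/3}."""
    nu = n ** (-1.0 / 3.0)
    ratio = 0.5 * (1.0 + mu * nu)          # m / n
    u = s * nu
    w = math.expm1(-u)                     # e^{-u} - 1, computed without cancellation
    # h(z) with z = e^{-u}: z - 1 = w, ln z = -u, ln(2 - z) = ln(1 - w)
    h = w + ratio * u + (1.0 - ratio) * math.log1p(-w)
    return n * h

def cubic_approximation(s, mu):
    """Leading terms s^3/3 + mu s^2/2 of (10.17)."""
    return s ** 3 / 3 + mu * s ** 2 / 2
```

With $s = 1.5$ and $\mu = 0.8$ the leading terms give $2.025$; the computed value of $nh$ differs from this by a few thousandths at $n = 10^9$ and by about a tenth of that at $n = 10^{12}$, in line with an error proportional to $\nu = n^{-1/3}$.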



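We also have

$$(1 - e^{-s\nu})^{1-y} = s^{1-y}\,\nu^{1-y}\,\bigl(1 + O(s\nu)\bigr)\,. \tag{10.18}$$

Therefore if $f(z) = (1-z)^{1-y}\, e^{nh(z)}$ is the integrand of (10.11) and (10.14), we have

$$e^{-\mu^3/6}\, f(e^{-s\nu}) = \nu^{1-y}\, s^{1-y}\, e^{K(\mu,s)}\, \bigl(1 + O(s\nu) + O(\mu^2 s^2 \nu) + O(s^4 \nu)\bigr)\,, \tag{10.19}$$

when $s = O(n^{1/12})$. (This restriction on $s$ ensures that $\mu^2 s^2 \nu$ and $s^4 \nu$ are bounded, hence
the $O$ terms of (10.17) can be moved out of the exponent.)
The exponent $K(\mu,s)$ in (10.19), when $s = \alpha + it$, is

$$K(\mu, \alpha+it) = \frac{3\alpha^{-1} - \alpha^{-3}}{6} + it - \frac{\alpha + \alpha^{-1}}{2}\, t^2 - \frac{i t^3}{3}\,.$$

The real part is bounded above by $\frac13 - t^2$, for all $\alpha > 0$, since $3\alpha^{-1} - \alpha^{-3} \le 2 \le \alpha + \alpha^{-1}$
with equality iff $\alpha = 1$. Hence the integrand $f(e^{-s\nu})$ becomes superpolynomially small
when $|t|$ grows, and we have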



e 



'^^^ if{z)-='^^^ r /(e-(«+^*)'^)dt + 0(e-(«+«~^)"^''/3) 

J Z 27r 



,2-y i-a+n^^'^'^i 



27ri 



/ si-j/e^(M.«) ds + 0{y''-yR) + o(e-("+«"')"'^'/3) 



U^-yA{y, p) + 0{v''-yR) + 0(e- -ax(2,|M|)ni/V3^ ^ 



† $\Re(x + iy) = x$ denotes the real part of the complex number $x + iy$.
where $s = \alpha + it$ and



/oo 
-oo 

The lemma will be proved if we can show that R = 0{l + ii^) and that R/A{y,ii) = 0(//^) 
as — > oo. 

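To show that each remainder integral $R_1$, $R_2$, $R_3$ is small, we will let $s = \alpha + iu/\beta$,
where $u = \beta t$ and

$$\beta = \sqrt{\alpha + \alpha^{-1}}\,. \tag{10.20}$$

Notice that when $\mu < 0$ we have $\alpha > 1$ and $\alpha = |\mu| + O(|\mu|^{-1})$; when $\mu > 0$ we have
$0 < \alpha < 1$ and $\alpha^{-1} = \mu + O(\mu^{-1})$. Therefore in both cases

$$\beta = |\mu|^{1/2} + O(|\mu|^{-1/2}) \qquad\text{as } |\mu| \to \infty\,. \tag{10.21}$$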



The first remainder, is 



lU 



/oo a II— a. 7b (■ 
|« + it|2-s'|e^('^'"+^^)|dt= ^ / 
-00 P J — 

If < 0, we have a(3 > -\/2, hence 

0(l)a2-2/ foo / 2-y\ 

Ri<^-^ / max 1, 1 + ^ ]e~'^'^du; 

P J-00 \ v2 / 

and the integral exists, so this is 0{\ii\^/^-y) by (10.21). Similarly, R2 = 0(|//p+V2-j/) 
when < 0, and R^ = 0(|//|9/2-2/). 

On the other hand, when > we have aP < -\/2, and we need to be more cautious. 
Instead of letting t run from —00 to +00 through real values in the derivation above, we 
will distort the path slightly near the origin, so that t passes through the point —i/P and 
so that Ps = aP + iu never has magnitude less than 1. (We used essentially the same sort 
of contour when deriving (10.4).) Then u passes through the point — z, and we have 

- p^-y 7_oo ^ ' ' ^ 

We therefore have Ri = 0(6-^^"/^^/^"^/^); similarly, R2 = 0(e-'^'/V^/^~^/^+^) and 
-R3 = 0(6"^^^/^//^/^"^/^). From (10.4) we know that A(y,fi) grows at least as fast as 
g-M /6^j//2-5/2 ^Yiis case the remainders behave even better than we have claimed 

in (10.5), although the error term 0{ii^/in}^^) is still necessary because of (10.10). □ 
If we differentiate the integral (10.7) with respect to $s$ and with respect to $\mu$, we obtain
a recurrence relation for $A(y,\mu)$ and a formula for the derivative:

$$(y-2)\,A(y,\mu) = \mu\, A(y-2,\mu) + A(y-3,\mu)\,; \tag{10.22}$$

$$A'(y,\mu) = \tfrac12 A(y-2,\mu) - \tfrac12 \mu^2 A(y,\mu)\,. \tag{10.23}$$

(The prime here denotes differentiation with respect to the second argument, $\mu$. The
derivative with respect to $y$ could also be worked out; but it depends on the derivative
of the Gamma function in a rather complicated way, and it is not expressible directly in
terms of $A$ itself.)

The derivative is more easily investigated if we define

$$B(y,\mu) = e^{\mu^3/6}\, A(y,\mu)\,. \tag{10.24}$$

Then

$$(y-2)\,B(y,\mu) = \mu\, B(y-2,\mu) + B(y-3,\mu)\,; \tag{10.25}$$

$$B'(y,\mu) = \tfrac12 B(y-2,\mu)\,. \tag{10.26}$$

It is easy to verify that the infinite series of (10.2) satisfies these relations. Repeated
application of (10.25) and (10.26) leads to a third-order differential equation for $B =
B(y,\mu)$:

$$8B''' - 4\mu^2 B'' + 2\mu(2y-9)\,B' - (y-2)(y-5)\,B = 0\,. \tag{10.27}$$
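
These relations can be checked numerically from the series. The sketch below is only an illustration under stated assumptions: it takes $B(y,\mu) = 3^{-(y+1)/3} \sum_k (\frac12 3^{2/3}\mu)^k / \bigl(k!\,\Gamma((y+1-2k)/3)\bigr)$, truncated at 120 terms, and it rewrites the differential equation using $B' = \frac12 B(y-2,\mu)$, $B'' = \frac14 B(y-4,\mu)$, $B''' = \frac18 B(y-6,\mu)$. All function names are hypothetical.

```python
import math

def rgamma(x):
    """Reciprocal Gamma function, equal to 0 at the nonpositive integers."""
    if x <= 0 and x == math.floor(x):
        return 0.0
    return 1.0 / math.gamma(x)

def B(y, mu, terms=120):
    """Truncated series for B(y, mu) = exp(mu^3/6) A(y, mu); moderate |mu| only."""
    x = 0.5 * 3 ** (2 / 3) * mu
    total = sum(x ** k / math.factorial(k) * rgamma((y + 1 - 2 * k) / 3)
                for k in range(terms))
    return total / 3 ** ((y + 1) / 3)

def recurrence_residual(y, mu):
    """(y-2) B(y) - mu B(y-2) - B(y-3), which should vanish."""
    return (y - 2) * B(y, mu) - mu * B(y - 2, mu) - B(y - 3, mu)

def derivative_residual(y, mu, h=1e-5):
    """Central-difference estimate of B'(y, mu) minus B(y-2, mu)/2."""
    return (B(y, mu + h) - B(y, mu - h)) / (2 * h) - 0.5 * B(y - 2, mu)

def ode_residual(y, mu):
    """Left side of the third-order equation, with derivatives replaced by shifts in y."""
    return (B(y - 6, mu) - mu ** 2 * B(y - 4, mu)
            + mu * (2 * y - 9) * B(y - 2, mu) - (y - 2) * (y - 5) * B(y, mu))
```

All three residuals come out at rounding-error level for sample values such as $y = 2.5$, $\mu = 0.7$.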

We can see from (10.22) that, for any fixed $\mu > 0$, there are infinitely many negative
values of $y$ such that $A(y,\mu) = 0$. For if $y < 0$ and there is no root between $y - 1$ and $y$,
then $A(y-1,\mu)$ and $A(y,\mu)$ have the same sign; hence $A(y+2,\mu)$ has the opposite sign,
and there's a root between $y$ and $y + 2$. Therefore we cannot use equation (10.5) until $|\mu|$
is sufficiently large, at least not when $y < 0$ and $\mu > 0$.

Lemma 3 implies the nonobvious inequality $A(y,\mu) \ge 0$ for all $y \ge 0$, since $A(y,\mu)$ is
proportional to the limiting value of the coefficients of $U(z)^{n-m}/(1-T(z))^{y}$, and these
coefficients are nonnegative. Moreover, $A(y,\mu)$ is strictly positive for $y \ge 2$ and all $\mu$. For
if $y \ge 2$ and $A(y,\mu_0) = 0$, we have $B(y,\mu_0) = 0$; but $B'(y,\mu) \ge 0$ by (10.26), hence we
must have $B(y,\mu) = 0$ for all $\mu \le \mu_0$, which is impossible because $B(y,\mu)$ is a nonconstant
analytic function of $\mu$ by (10.2).

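When $y = 1$ there is a "closed form" in terms of the Airy function:

$$A(1,\mu) = e^{-\mu^3/12}\, \mathrm{Ai}(\mu^2/4)\,; \tag{10.28}$$

this is proved in [14, (A.12) and (A.19)]. If we differentiate (10.28) with respect to $\mu$,
taking note of the fact that (10.22) gives

$$A(-1,\mu) = -\mu\, A(0,\mu)\,, \tag{10.29}$$

we find

$$A(0,\mu) = -\tfrac12 \mu\, e^{-\mu^3/12}\, \mathrm{Ai}(\mu^2/4) - e^{-\mu^3/12}\, \mathrm{Ai}'(\mu^2/4)\,. \tag{10.30}$$

Therefore in particular,

$$e^{\mu^3/12} A(1,\mu) \qquad\text{and}\qquad e^{\mu^3/12} \bigl( A(0,\mu) + \tfrac12 \mu\, A(1,\mu) \bigr)$$

are even functions of $\mu$. The well-known relations between Airy functions and Bessel
functions,

$$\mathrm{Ai}(z) = \frac{1}{\pi} \sqrt{\frac{z}{3}}\ K_{1/3}\Bigl(\frac{2z^{3/2}}{3}\Bigr)\,, \qquad \mathrm{Ai}'(z) = -\frac{z}{\pi\sqrt3}\ K_{2/3}\Bigl(\frac{2z^{3/2}}{3}\Bigr)\,,$$

yield the additional formulas (for $\mu > 0$)

$$A(1,\mu) = \frac{\mu\, e^{-\mu^3/12}}{2\pi\sqrt3}\ K_{1/3}\Bigl(\frac{\mu^3}{12}\Bigr)\,; \tag{10.31}$$

$$A(0,\mu) + \tfrac12 \mu\, A(1,\mu) = A(3,\mu) - \tfrac12 \mu\, A(1,\mu) = \frac{\mu^2 e^{-\mu^3/12}}{4\pi\sqrt3}\ K_{2/3}\Bigl(\frac{\mu^3}{12}\Bigr)\,. \tag{10.32}$$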

Since we know A{y, 11) for y = —1, 0, and 1, we can use (10.22) to determine A{y, ji) 
for all negative integers j/, and for j/ = 3 as indicated in (10.32). But a new idea is needed 
if we hope to have a closed form when y = 2. It is possible to express A{2, ji) as an infinite 
sum of Bessel functions, 

A(2,;.) = l(e-^/^ + e-^/-(5:(^l)^(4+,/3(f^)-4+,/3 (10.33) 

but this may be as close to a closed form as possible unless we use general hypergeometric 
functions. Equation (10.33) follows from (10.2) and the hypergeometric identity 

F(| + a, 1 + 2a - 6 - c; 1 + 2a - 6, 1 + 2a - c; 2^) 

^ e-r(a) ^ (-l)^'(2a)W(A- + a)/,+jz) 
^ (1 + 2a - 6)^1 + 2a - c)'=A;! 

[34, equation (2.8)]; here denotes V{x + k)/T{x), and we obtain (10.33) by setting 
z = 11^/12, (a,6,c) = (|,|,l) and (|,|,l). 

The facts that K^/^{z) = 3-'/^n{l_^/s{z) - h/^{z)), K^/siz) = 3-^/^n{l_2/s{z) - 
12/3(2;)), and — Io{z) + 2^j^-^^{—l)'^Ik{z) suggest that we look for an identity of the 
form 



k>0 ^ ^ 

_ ^ akjy) ( ^ / 2fc+2y-l 2k+2y-l //^N 

3(y-2)/3 2^J.(k±y±l^ 1,2 •31/3; 6 ' 3 ' 6 r ^ ' 
Any formal power series in $\mu$ has such an expansion, for all $y > -1$. But the coefficients
$a_k(y)$ do not appear to have a simple form except in the cases already mentioned. We have

/^ 1 y{y - 3) (i/^-i)(i/-6) 

ao(2/)=3, ^^yy' = ^~' a2{y) = g > a3{y) = — 

„ ^ {y-l){y + 2W-Uy + 12) ^ (j; + 3)j/(|/-l)(j/-3)(|/-14) 

4^^; 72 ' ^^^^ 360 

Splitting (10.2) into three sums according to the value of k mod 3 yields a closed form 
for A{y,n) in terms of general hypergeometric series: 

A(y a) = e-^'/' ( ^ f(^ ^-1 ^ ■ ^] 

^Ky.N \^3(2'+i)/3r((y+l)/3) V 6 ' 6 '3' 3' 6; 

^ 1 f^-y 7-y 2 4 //3 



3(j/-i)/3r((y-i)/3) 2 V 6 ' 6 '3' 3' 6 

^ 3(2^-3)/3r((|/-3)/3) 8 V6 ' 6 '3'3'6''' ^ ^ ^ 

11. Application to bicyclic components. Now we are ready to begin using the basic
theoretical results of the preceding sections. We will start by considering the case when the
parameter $\mu$ of Lemma 3 is very small, say $\mu = O(n^{-1/3})$. Then there are $m = \frac12 n + O(n^{1/3})$
edges.

Theorem 4. The probability that a random graph or multigraph with $n$ vertices and
$\frac12 n + O(n^{1/3})$ edges has exactly $r$ bicyclic components, and no components of higher cyclic
order, is

$$\Bigl(\frac{5}{18}\Bigr)^{r} \sqrt{\frac23}\ \frac{1}{(2r)!} + O(n^{-1/3})\,. \tag{11.1}$$
Proof (The special case r = and m = |n of this theorem was Corollary 9 of [14].) 
Consider first the case of random multigraphs, since this case is simpler. If there are 
n vertices, m edges, r bicyclic components, and no components with higher cyclic order, 
there must be exactly n—m+r acyclic components. The probability of such a configuration, 
according to (2.2), is therefore 

n^m I ^n_m + r)\ r! ' ^ ^ ^ 

where $U(z)$, $V(z)$, $W(z)$ are the generating functions (3.3), (3.4), and (3.7). Now

$$W(z) = \frac{5}{24}\,\frac{1}{(1-T(z))^3} - \frac{7}{24}\,\frac{1}{(1-T(z))^2} + \frac{1}{12}\,\frac{1}{1-T(z)}, \tag{11.3}$$

using the coefficients of (7.20); so we see that $W(z)^r$ is a polynomial of degree $3r$
in $(1-T(z))^{-1}$, with leading coefficient $(\frac{5}{24})^r$. Lemma 3 tells us that the leading term
of $W(z)^r$ is the only significant one, asymptotically speaking, because the other terms
contribute at most $n^{-1/3}$ times as much as the leading term. We can also write

$$U(z)^r = 2^{-r}\bigl(1-(1-T(z))^2\bigr)^r; \tag{11.4}$$

this allows us to replace $U(z)^r$ by $2^{-r}$ in (11.2). Since $e^{V(z)} = (1-T(z))^{-1/2}$, the value
of (11.2) is

$$\frac{(n-m)!}{(n-m+r)!\,r!\,2^r}\;\frac{\sqrt{2\pi}\,n^r}{3^{r+1/2}\,\Gamma(r+1/2)}\,\Bigl(\frac{5}{24}\Bigr)^{r}\bigl(1+O(n^{-1/3})\bigr).$$

This simplifies to (11.1) using the fact that

$$\frac{(n-m)!}{(n-m+r)!} = \frac{2^r}{n^r}\bigl(1 + O(rn^{-2/3} + r^2n^{-1})\bigr),$$

and using a special case of the duplication formula for the Gamma function,

$$\Gamma(r+1/2) = \frac{(2r)!\,\sqrt{\pi}}{4^r\,r!}. \tag{11.5}$$

On the other hand if we are dealing with random graphs we must replace (11.2) by

$$\frac{n!}{\binom{\binom n2}{m}}\,[z^n]\,\frac{U(z)^{n-m+r}}{(n-m+r)!}\,e^{\hat V(z)}\,\frac{\hat W(z)^r}{r!}, \tag{11.6}$$

where $\hat V(z)$ and $\hat W(z)$ appear in (3.5) and (3.6). Again we have $\hat W(z) = \frac{5}{24}(1-T(z))^{-3}$
plus less significant terms, so $\hat W(z)$ produces an effect similar to $W(z)$. But $\hat V(z) =
V(z) - \frac12 T(z) - \frac14 T(z)^2$; so we now want the coefficient of $z^n$ in an expression proportional
to

$$\frac{U(z)^{n-m}}{(1-T(z))^{3r+1/2}}\;e^{-T(z)/2-T(z)^2/4},$$

which has an exponential factor not covered by Lemma 3. The proof of Lemma 3 shows,
however, that this exponential factor simply changes the result by a factor of $e^{-3/4} +
O(n^{-1/3})$: We multiply (10.18) by $\exp(-e^{-s\nu}/2 - e^{-2s\nu}/4) = e^{-3/4} + O(s\nu)$.

Furthermore, (11.6) contains a factor $e^{+3/4}$ to cancel the $e^{-3/4}$, because of (2.4).
Therefore the leading term of the asymptotic probability for graphs is the same as it was
for multigraphs. □
Corollary. The probability that a random graph or multigraph with $n$ vertices and $\frac12 n$
edges has only acyclic, unicyclic, and bicyclic components is

$$\sqrt{\frac23}\,\cosh\sqrt{\frac{5}{18}} + O(n^{-1/3}) \approx 0.9325. \tag{11.7}$$

Proof. The sum over r of the estimate made in Theorem 4 clearly gives a lower bound, so 
we must prove that it is also an upper bound. That sum can be written 

2""m!n! 



[z-]Uizr-^U-miz), 



(n — m)\'n?'^ 
where 

r>0 ^ ^ 

If we look at the proof of Theorem 4, and the proof of Lemma 3 on which it is based, we 
see that the calculations all depend on fi[ze~^), where \z\ < and v — n~^l^ . In this 
region, 

|T(;2e-^)| < e"", \\-T{ze-')\>v + 0{v^). (11.9) 

Thus the sum fn-m{ze~^) converges uniformly for all n and all 1^1 < e~'^. Uniform 
convergence allows us to interchange summation and integration. (Notice that the function 
h{z) in the proof of Lemma 3, which influences the behavior of the integrand most strongly 
as n — > oo, is independent of r.) □ 

Another proof of (11.7) will be given below. 
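
The leading terms of (11.1) and the constant in (11.7) are easy to check numerically. The following Python sketch is only an illustration: the function name `bicyclic_limit` and the truncation at 20 terms are choices made here, not part of the paper. It sums the leading terms of Theorem 4 over $r$ and compares the result with the closed form $\sqrt{2/3}\cosh\sqrt{5/18}$.

```python
from math import cosh, factorial, sqrt

def bicyclic_limit(r):
    """Leading term of (11.1): limiting probability of exactly r bicyclic
    components and no component of higher cyclic order."""
    return sqrt(2 / 3) * (5 / 18) ** r / factorial(2 * r)

# Truncated sum over r; the terms decay like 1/(2r)!, so 20 terms are plenty.
partial_sum = sum(bicyclic_limit(r) for r in range(20))

# Closed form appearing in (11.7).
closed_form = sqrt(2 / 3) * cosh(sqrt(5 / 18))

if __name__ == "__main__":
    for r in range(4):
        print(r, f"{bicyclic_limit(r):.5f}")
    print("sum over r :", f"{partial_sum:.6f}")
    print("closed form:", f"{closed_form:.6f}")
```

The values for $r = 0$ and $r = 1$ can be compared with the entries .81650 and .11340 that appear in the list of single-component probabilities in section 12.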

12. Components of higher cyclic order. Now let's consider components that are
tricyclic, tetracyclic, etc. (Notice that tricyclic components correspond to $C_2(z)$, not
$C_3(z)$, in the notation of section 2; our notation has mathematical advantages, but it is
slightly out of phase with the traditional terminology.)

Theorem 5. The probability that a random graph or multigraph with $n$ vertices and
$\frac12 n + O(n^{1/3})$ edges has exactly $r_1$ bicyclic components, $r_2$ tricyclic components, ..., $r_q$ $(q+1)$-cyclic
components, and no components of higher cyclic order, is

$$\Bigl(\frac43\Bigr)^{r}\sqrt{\frac23}\;\frac{c_1^{r_1}}{r_1!}\,\frac{c_2^{r_2}}{r_2!}\cdots\frac{c_q^{r_q}}{r_q!}\;\frac{r!}{(2r)!} + O(n^{-1/3}), \tag{12.1}$$

where $r = r_1 + 2r_2 + \cdots + qr_q$ and the constants $c_j$ are defined in (8.6).

Proof. If there are $n$ vertices and $m$ edges, there must be exactly $n-m+r$ acyclic
components. So we can argue as in Theorem 4 to find

$$\frac{2^m\,m!\,n!}{n^{2m}}\,[z^n]\,\frac{U(z)^{n-m+r}}{(n-m+r)!}\,e^{V(z)}\,\frac{C_1(z)^{r_1}}{r_1!}\,\frac{C_2(z)^{r_2}}{r_2!}\cdots\frac{C_q(z)^{r_q}}{r_q!}
 = \frac{c_1^{r_1}}{r_1!}\,\frac{c_2^{r_2}}{r_2!}\cdots\frac{c_q^{r_q}}{r_q!}\;\frac{\sqrt{2\pi}}{3^{r+1/2}\,\Gamma(r+1/2)} + O(n^{-1/3}).$$

Formula (12.1) now follows from (11.5) as before. □



Let's illustrate the consequences of Theorem 5 by computing the limiting probabilities
for small values of the parameters $(r_1, r_2, \ldots, r_q)$. Here is a list of all configurations with
$r_1 + r_2 + \cdots + r_q > 1$ that occur with limiting probability .000005 or more, showing the
probabilities rounded to five decimal places:

    [2] = .00263            [0,2] = .00008          [1,0,0,0,0,1] = .00002
    [1,1] = .00105          [1,0,0,0,1] = .00004    [2,1] = .00001
    [1,0,1] = .00031        [0,1,1] = .00003        [0,1,0,1] = .00001
    [1,0,0,1] = .00010      [3] = .00002            [1,0,0,0,0,0,1] = .00001

(The notation [2] stands for the case $r_1 = 2$, $r_2 = r_3 = \cdots = 0$; similarly $[r_1, \ldots, r_q]$ implies
that there are no complex components of cyclic order greater than $q + 1$.)

The sum of these probabilities, .00431, is nicely balanced by $\sqrt{2/3}$ plus the sum of
probabilities when there is only one complex component, i.e., when $r_q = 1$ and all other
$r$'s are zero:

    .81650 + .11340 + .03780 + .01547 + .00678 + .00307 + .00141
    + .00066 + .00031 + .00015 + .00007 + .00003 + .00002 + .00001 ;

this comes to .99568 = .99999 - .00431.
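
The entries in these lists can be reproduced from (12.1). The sketch below is illustrative only and rests on two assumptions that are not restated in this section: the closed form $e_r = (6r)!/(2^{5r}3^{2r}(3r)!(2r)!)$ for the constants of (7.2), quoted here from memory, and the relation $\exp(\sum_j c_jz^j) = \sum_r e_rz^r$, which is how section 13 describes $e_r$ (the sum of (13.10) over all configurations of excess $r$). The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, sqrt

def e_coeff(r):
    # Assumed closed form for e_r (the constants of (7.2)).
    return Fraction(factorial(6 * r),
                    2 ** (5 * r) * 3 ** (2 * r) * factorial(3 * r) * factorial(2 * r))

def c_coeffs(q):
    """Return [0, c_1, ..., c_q], assuming exp(sum c_j z^j) = sum e_r z^r.

    Differentiating gives r*e_r = sum_{j=1..r} j*c_j*e_{r-j}, solved for c_r."""
    e = [e_coeff(r) for r in range(q + 1)]
    c = [Fraction(0)] * (q + 1)
    for r in range(1, q + 1):
        lower = sum((j * c[j] * e[r - j] for j in range(1, r)), Fraction(0))
        c[r] = e[r] - lower / r
    return c

def config_prob(rs):
    """Leading term of (12.1) for the configuration [r_1, ..., r_q]."""
    c = c_coeffs(len(rs))
    r = sum(j * rj for j, rj in enumerate(rs, start=1))  # total excess
    weight = Fraction(4, 3) ** r * Fraction(factorial(r), factorial(2 * r))
    for j, rj in enumerate(rs, start=1):
        weight *= c[j] ** rj / factorial(rj)
    return sqrt(2 / 3) * float(weight)

if __name__ == "__main__":
    for rs in ([2], [1, 1], [1, 0, 1], [0, 2], [3]):
        print(rs, f"{config_prob(rs):.5f}")
```

Under these assumptions the recurrence yields $c_1 = 5/24$ and $c_2 = 5/16$, and `config_prob([0, 1])` matches the third entry .03780 of the single-component list.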

Suppose $\mathcal{R}$ is any countably infinite set of configurations $[r_1, r_2, \ldots, r_q]$, where $q$ might
be unbounded. We would like to prove that a random graph or multigraph with approximately
$\frac12 n$ edges lies in $\mathcal{R}$ with limiting probability

$$\sum\bigl\{\,P[r_1,r_2,\ldots,r_q] \bigm| [r_1,r_2,\ldots,r_q]\in\mathcal{R}\,\bigr\}, \tag{12.2}$$

where $P[r_1, r_2, \ldots, r_q]$ is the limiting value stated in Theorem 5. The technique we used to
prove (11.7) does not apply, because the infinite sums over which integration takes place
might not converge uniformly when $q$ is unbounded.

However, we are obviously justified in claiming that (12.2) is a lower bound for the
stated probability, because the sum over any finite subset of $\mathcal{R}$ yields a lower bound.

We will prove below that the sum of $P[r_1, r_2, \ldots, r_q]$ over all possible configurations
$[r_1, r_2, \ldots, r_q]$ is 1. Consequently, the sum (12.2) must in fact be the limiting probability
of a random graph or multigraph being in $\mathcal{R}$, not just a lower bound. If (12.2) were too
low, we would not obtain 1 by adding the complementary probabilities $P[r_1,r_2,\ldots,r_q]$ for
$[r_1,r_2,\ldots,r_q] \notin \mathcal{R}$. This observation will lead to the promised "second proof" of (11.7), if
we also sum less significant terms to obtain the error bound $O(n^{-1/3})$.
13. Excess Edges. The notion of "excess" was used somewhat informally in the introductory
sections of this paper. Let us now define it formally, saying that the excess of a
graph or multigraph is the number of edges plus the number of acyclic components, minus
the number of vertices. Thus a $(q+1)$-cyclic component has excess $q$, when $q > 0$. If a
graph or multigraph has $r_1$ bicyclic components, $r_2$ tricyclic components, etc., then it has
excess $r = r_1 + 2r_2 + 3r_3 + \cdots$.

If $G$ and $G'$ are graphs on the same vertices, and if $G \cup G'$ and $G \cap G'$ denote the
graphs obtained by taking the union and intersection of their edges, the excesses satisfy

$$r(G) + r(G') \le r(G \cup G') + r(G \cap G').$$

For we can start with empty graphs and insert the edges of $G \cap G'$, preserving equality.
Then if we insert an edge of $G \setminus G'$ or of $G' \setminus G$, each side of the inequality increases by
either 0 or 1; and the left side cannot increase by 1 unless the right side does also. For
example, if the left side increases by 1 when we add an edge of $G \setminus G'$, the endpoints of
that edge are in non-trees of $G$, so they surely are in non-trees of $G \cup G'$.
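
The definition of excess and the inequality above can be exercised on small examples. The following Python sketch is an illustration only: the helper names `excess` and `check_supermodular`, the union-find representation, and the edge density 0.3 are all choices made here. It computes the excess as edges plus acyclic components minus vertices, then tests the inequality on random pairs of simple graphs.

```python
import random

def excess(n, edges):
    """Excess = (#edges) + (#acyclic components) - (#vertices)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    verts, edge_count = {}, {}
    for v in range(n):
        root = find(v)
        verts[root] = verts.get(root, 0) + 1
    for u, _ in edges:
        root = find(u)
        edge_count[root] = edge_count.get(root, 0) + 1

    # A component is acyclic (a tree) exactly when it has one edge fewer than vertices.
    acyclic = sum(1 for root, k in verts.items() if edge_count.get(root, 0) == k - 1)
    return len(edges) + acyclic - n

def check_supermodular(n, trials, seed=0):
    """Test r(G) + r(G') <= r(G u G') + r(G n G') on random simple graphs."""
    rng = random.Random(seed)
    pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
    for _ in range(trials):
        g1 = {e for e in pairs if rng.random() < 0.3}
        g2 = {e for e in pairs if rng.random() < 0.3}
        if excess(n, g1) + excess(n, g2) > excess(n, g1 | g2) + excess(n, g1 & g2):
            return False
    return True

if __name__ == "__main__":
    k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    print("excess of K4:", excess(4, k4))
    print("inequality holds on samples:", check_supermodular(7, 500))
```

For instance, the complete graph on four vertices has six edges and no acyclic components, so its excess is 6 + 0 - 4 = 2, in agreement with its being a tricyclic component.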

We have seen in Theorem 5 that the limiting joint probability distribution of the
random variables $(r_1, r_2, \ldots)$ in a large random graph or multigraph with approximately
$\frac12 n$ edges has the form

$$\frac{c_1^{r_1}}{r_1!}\,\frac{c_2^{r_2}}{r_2!}\cdots\frac{c_q^{r_q}}{r_q!}\,f(r), \tag{13.1}$$

where $r = r_1 + 2r_2 + \cdots + qr_q$ is the excess of the graph and $r_l = 0$ for $l > q$. Indeed, this
is not surprising, if we look at the problem in another way.

Let $\mathcal{S}$ be the set of all multigraphs of configuration $[r_1, r_2, \ldots, r_q]$, and let $S(w,z)$ be
its bgf. The probability that a given multigraph with $m$ edges and $n$ vertices lies in $\mathcal{S}$ is
then

$$\Pr_{mn}(\mathcal{S}) = \frac{[w^mz^n]\,S(w,z)}{[w^mz^n]\,G(w,z)}. \tag{13.2}$$

We can also express this as

$$\Pr_{mn}(\mathcal{S}) = \Pr_{mn}(\mathcal{S} \mid r)\,\Pr_{mn}(\mathcal{E}_r), \tag{13.3}$$

where $\Pr_{mn}(\mathcal{S} \mid r)$ means the probability of obtaining an element of $\mathcal{S}$ given that the excess
is $r$, and $\Pr_{mn}(\mathcal{E}_r)$ is the probability that a random multigraph has excess $r$:

$$\Pr_{mn}(\mathcal{S} \mid r) = \frac{[w^mz^n]\,S(w,z)}{[w^mz^n]\,e^{U(w,z)+V(w,z)}E_r(w,z)}, \qquad
\Pr_{mn}(\mathcal{E}_r) = \frac{[w^mz^n]\,e^{U(w,z)+V(w,z)}E_r(w,z)}{[w^mz^n]\,G(w,z)}. \tag{13.4}$$
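Since all elements of $\mathcal{S}$ have excess $r$, we can compute $[w^mz^n]\,S(w,z)$ with univariate
generating functions:

$$\begin{aligned}
S(w,z) &= e^{U(w,z)+V(w,z)}\,\frac{C_1(w,z)^{r_1}}{r_1!}\,\frac{C_2(w,z)^{r_2}}{r_2!}\cdots\frac{C_q(w,z)^{r_q}}{r_q!} \\
 &= e^{U(wz)/w+V(wz)}\,\frac{(wC_1(wz))^{r_1}}{r_1!}\,\frac{(w^2C_2(wz))^{r_2}}{r_2!}\cdots\frac{(w^qC_q(wz))^{r_q}}{r_q!};
\end{aligned}$$

hence

$$[w^mz^n]\,S(w,z) = [z^n]\,\frac{U(z)^{n+r-m}}{(n+r-m)!}\,e^{V(z)}S(z), \tag{13.5}$$

if we let

$$S(z) = \frac{C_1(z)^{r_1}}{r_1!}\,\frac{C_2(z)^{r_2}}{r_2!}\cdots\frac{C_q(z)^{r_q}}{r_q!}.$$

Similarly

$$[w^mz^n]\,e^{U(w,z)+V(w,z)}E_r(w,z) = [z^n]\,\frac{U(z)^{n+r-m}}{(n+r-m)!}\,e^{V(z)}E_r(z).$$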

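A multigraph with $m$ edges, $n$ vertices, and excess $r > 0$ has $t = n + r - m$ components
that are trees (including isolated vertices). Suppose it has $n_1$ vertices in complex
components and $n_0$ vertices in trees and unicyclic components. Then

$$\begin{aligned}
\Pr_{mn}(\mathcal{S} \mid r)
 &= \frac{[z^n]\,U(z)^te^{V(z)}S(z)}{[z^n]\,U(z)^te^{V(z)}E_r(z)}
 = \frac{\sum_{n_0+n_1=n}\bigl([z^{n_0}]\,U(z)^te^{V(z)}\bigr)\bigl([z^{n_1}]\,S(z)\bigr)}
        {\sum_{n_0+n_1=n}\bigl([z^{n_0}]\,U(z)^te^{V(z)}\bigr)\bigl([z^{n_1}]\,E_r(z)\bigr)} \\
 &= \sum_{n_1}\Pr(\mathcal{S} \mid r, n_1)\,\Pr_{mn}(n_1 \mid \mathcal{E}_r),
\end{aligned}$$

where

$$\Pr(\mathcal{S} \mid r, n_1) = \frac{[z^{n_1}]\,S(z)}{[z^{n_1}]\,E_r(z)}, \tag{13.6}$$

$$\Pr_{mn}(n_1 \mid \mathcal{E}_r) = \frac{\bigl([z^{n-n_1}]\,U(z)^te^{V(z)}\bigr)\bigl([z^{n_1}]\,E_r(z)\bigr)}{[z^n]\,U(z)^te^{V(z)}E_r(z)}. \tag{13.7}$$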

Thus, $\Pr(\mathcal{S})$ has been expressed in terms of a simple ratio (13.6), the number of
multigraphs consisting of precisely $r_j$ components of excess $j$ for $1 \le j \le q$, divided by
the number of complex multigraphs of excess $r$. We know from section 9 that there are
coefficients $s_d$ such that

(i-r(.))-^i-rw)-'+-^i-rwr- 

Indeed, section 9 tells us that Sd is Yl i^{M) / {2r — d)l, summed over all reduced multigraphs 
of configuration [ri, r2, . . . , r^] having exactly 2r — d vertices. We can also write 



letting s'd = Efc as in (7.22). Therefore, 

n! [z""] S{z) = s'otn{3r) + s[tn{3r -!) + ••• + s'^Mr) , 

expressing the relevant number of multigraphs in terms of the tree polynomials (3.8); and 
(3.9) tells us that 

./oZ n-l/2+3r/2 

n! [z-] S{z) = 4 ^^ij^-— — — (1 + 0(n-i/2)) . 
I 1 y J 23'^/2r(3r/2) ^ ^ 

Similarly, we have 

fnZ n-l/2+3r/2 

nl [Z-] EAz) = <o ^,3,/.r(3./2) + ^^^"'^')) " 
Therefore the ratio (13.6) is

$$\Pr(\mathcal{S} \mid r, n_1) = \frac{s'_0}{e'_{r0}}\bigl(1 + O(n_1^{-1/2})\bigr);$$

and we can sum over $n_1$ to get

$$\Pr_{mn}(\mathcal{S}) = \Bigl(\frac{s'_0}{e'_{r0}} + O(\epsilon)\Bigr)\Pr_{mn}(\mathcal{E}_r), \tag{13.9}$$

where $\epsilon$ is the expected value of $n_1^{-1/2}$ in the probability distribution (13.7).
Moreover, the leading coefficient is

$$s'_0 = s_0 = \frac{c_1^{r_1}}{r_1!}\,\frac{c_2^{r_2}}{r_2!}\cdots\frac{c_q^{r_q}}{r_q!}; \tag{13.10}$$

and $e'_{r0}$ is just $e_r$, the sum of (13.10) over all configurations $[r_1, r_2, \ldots, r_q]$ with $r_1 + 2r_2 +
\cdots + qr_q = r$. This derivation explains why we obtained a formula of the form (13.1) in
Theorem 5.

With graphs instead of multigraphs, the same considerations apply, but we must add 
more terms to the formulas. For example, (13.8) becomes 

Siz) = TT^^ + j\,3.-l +--- + S'sr + S'sr+l (l ' Az)) + ^.+2 (l ' T(.)) ' • 

(13.11) 

The leading coefficient s'q is the same as sq, so the asymptotic behavior is the same as 
before, if we assume that $m$ is large enough to make the expected value of $n_1^{-1/2}$ approach
zero.



We can estimate the expected value of $n_1^{-1/2}$ by finding the expected value of

$$\frac{[z^{n_1}]\bigl(S(z) - (s_0/e_r)E_r(z)\bigr)}{[z^{n_1}]\,E_r(z)}; \tag{13.12}$$

indeed, this expected value is the true error in the approximation (13.9), so it is even more
relevant than the expected value of $n_1^{-1/2}$. Since $S(z) - (s_0/e_r)E_r(z)$ can be expressed as
$(s'_1 - e'_{r1}s_0/e_r)/(1-T(z))^{3r-1}$ plus less significant terms, the desired expected value times
$\Pr_{mn}(\mathcal{E}_r)$ is obtained by applying Lemma 3 as we did in the proof of Theorem 4, but with
$3r$ replaced by $3r - 1$. The result, when $m$ is near $\frac12 n$, is proportional to $n^{-1/3}$.

The expected value of $n_1^k$ can be computed if we replace $S(z)$ by $\vartheta^kE_r(z)$ in these
formulas, because $[z^n]\,\vartheta^kE_r(z) = n^k[z^n]\,E_r(z)$. This has the effect of changing the leading
term from $e_r/(1-T(z))^{3r}$ to $(3r)(3r+2)\ldots(3r+2k-2)\,e_r/(1-T(z))^{3r+2k}$, so the result
when $m$ is near $\frac12 n$ is proportional to $n^{2k/3}$. We have proved

Corollary. If $m = \frac12 n(1 + \mu n^{-1/3})$ and $|\mu| \le n^{1/12}$, the $k$th moment $E_{mn}(n_1^k \mid r)$ of the
number of vertices in complex components, given that the total excess is $r$, is

'''^^ ',2k/3ll'li\2^y '^'{^ + 0{^) + 0{n-'/^)) , if/. = 0(1); (13.13) 

2fc/3 

aur^^{l + 0{\lJi\-^) + 0(//%-i/3)) , if -oo; (13.14) 

A* 

2fe/3 k 'pf^f-L. 2.\ 

^kr "" ^/ ^,3^^ 1 ^ (1 + 0{^,-') + Q(/n-^/^)) , if/. ^+oo; (13.15) 
here $a_{kr} = (3r)(3r+2)\ldots(3r+2k-2)$.

Proof. These expressions are $a_{kr}$ times the ratios of formulas (10.2), (10.3), and (10.4)
when $y = 3r + \frac12 + 2k$ to their values when $y = 3r + \frac12$. □
Notice that when $m$ is approximately $\frac12 n - n^{3/4}$, the probable value of $n_1$ is proportional
to $n^{1/2}$. When $m \approx \frac12 n + n^{3/4}$, it is proportional to $n^{3/4}$.
These are the extreme cases $|\mu| = n^{1/12}$ at the limits of Lemma 3's range.

We can use formula (13.3) whenever $\mathcal{S}$ is a collection of multigraphs whose complex
components have total excess $r$. We can use formula (13.6) whenever $\mathcal{S}$ also places no
restriction on its non-complex (acyclic and unicyclic) components. For example, we can
determine the conditional probability that a random graph with $\frac12 n$ edges has a bicyclic
component of each of the three types in (9.15), given that it has excess 1. The generating
functions $S(z)$ for the three cases are respectively $\frac18 T/(1-T)^2$, $\frac18 T^2/(1-T)^3$, $\frac{1}{12}T^2/(1-T)^3$;
so the respective conditional probabilities are

$$O(n^{-1/3}), \qquad \tfrac35 + O(n^{-1/3}), \qquad \tfrac25 + O(n^{-1/3}). \tag{13.16}$$

All probabilities that are conditional on excess $r$ must, of course, be multiplied by
$\Pr_{mn}(\mathcal{E}_r)$, the probability that a random multigraph has excess $r$. Lemma 3 and the
method of Theorem 5 make this easy to compute:

Corollary. A graph or multigraph with $m = \frac12 n(1 + \mu n^{-1/3})$ edges and $n$ vertices has
excess $r$ with probability

PTmniSr) = V2^ Cr A{3r + ^ , fx) + O (^^^^^ , (13.17) 

uniformly for $|\mu| \le n^{1/12}$ as $n \to \infty$, where $e_r = e_{r0}$ is given by (7.2) and $A(y,\mu)$ is
given by (10.2). When $\mu \to -\infty$, the probability is $O(|\mu|^{-3r})$; when $\mu \to +\infty$ it is
$O(\mu^{3r/2}e^{-\mu^3/6})$. □

(The special case $r = 0$ in (13.17), without the error bound, was found by Britikov [9],
who proved that a random graph has excess 0 with probability approaching $\sqrt{2\pi}\,A(\frac12, \mu)$,
for fixed $\mu$ as $n \to \infty$.)

Here is a table that shows how the probabilities of having excess $r$ change as the graph
or multigraph evolves past the critical point $m = \frac12 n$:

              r=0    r=1    r=2    r=3    r=4    r=5    r=6    r=7    r=8    r=9    r=10
    μ = -3   .994   .006   .000   .000   .000   .000   .000   .000   .000   .000   .000
    μ = -2   .983   .015   .001   .000   .000   .000   .000   .000   .000   .000   .000
    μ = -1   .947   .043   .008   .002   .000   .000   .000   .000   .000   .000   .000
    μ =  0   .816   .113   .040   .017   .007   .003   .001   .001   .000   .000   .000
    μ =  1   .475   .179   .115   .077   .052   .035   .023   .015   .010   .007   .004
    μ =  2   .100   .082   .085   .086   .084   .079   .073   .066   .058   .051   .043
    μ =  3   .003   .004   .007   .010   .013   .017   .020   .024   .028   .031   .034

The mean excess is approximately .308, 1.544, 6.364, 19.009, for $\mu = 0, 1, 2, 3$.
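
Entries of this table can be recomputed from the leading term $\sqrt{2\pi}\,e_r\,A(3r+\frac12,\mu)$ of (13.17). The Python sketch below is illustrative and depends on two formulas that are not displayed in this section and are quoted here from memory as assumptions: the series $A(y,\mu) = \frac{e^{-\mu^3/6}}{3^{(y+1)/3}}\sum_{k\ge0}\frac{(\frac12 3^{2/3}\mu)^k}{k!\,\Gamma((y+1-2k)/3)}$ for (10.2), and the closed form $e_r = (6r)!/(2^{5r}3^{2r}(3r)!(2r)!)$ for (7.2). The function names and the truncation at 60 terms are choices made here; plain floating point is adequate only for moderate $|\mu|$.

```python
from math import exp, factorial, gamma, pi, sqrt

def e_coeff(r):
    # Assumed closed form for e_r (the constants of (7.2)).
    return factorial(6 * r) / (2 ** (5 * r) * 3 ** (2 * r)
                               * factorial(3 * r) * factorial(2 * r))

def A(y, mu, terms=60):
    """Truncated series assumed for A(y, mu) of (10.2).

    For y = 3r + 1/2 the Gamma arguments are never integers, so no pole is hit."""
    c = 0.5 * 3 ** (2 / 3) * mu
    total = 0.0
    for k in range(terms):
        total += c ** k / factorial(k) / gamma((y + 1 - 2 * k) / 3)
    return exp(-mu ** 3 / 6) / 3 ** ((y + 1) / 3) * total

def excess_prob(r, mu):
    """Leading term of (13.17): limiting probability of excess r."""
    return sqrt(2 * pi) * e_coeff(r) * A(3 * r + 0.5, mu)

if __name__ == "__main__":
    for mu in (-1.0, 0.0, 1.0):
        row = " ".join(f"{excess_prob(r, mu):.3f}" for r in range(4))
        print(f"mu = {mu:+.0f}: {row}")
```

At $\mu = 0$ only the $k = 0$ term of the series survives, and the probability of excess 0 reduces to $\sqrt{2/3}$.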

In this paper we are interested mainly in graphs or multigraphs with approximately
$\frac12 n$ edges, but it is instructive to consider also the formulas that arise when $m$ is somewhat
smaller. The excess is then almost surely zero. In fact, we can obtain a formula that
has a much better error bound than (13.17), in the case $r = 0$ and $\mu < -n^{\epsilon}$: If we set
$\lambda = 2m/n$, and if $m < \frac12 n - n^{2/3+\epsilon}$, the probability of excess 0 can be shown to be exactly


2-m!n! ^^^^ Uiz] 



n—m 



(n-m)!n2- (l-T(z))^/' 

SM^^ <f(l- 1 + eMn,A,t)-.72 , (13.18) 

27r S(n -m) J \ 1-Xj \ Xj ' ^ ' 



V27TS{n-m) 

where 



5(n) = ^-= = l + -) , (13.19) 



h{n, A, t) = nh{\ + it(3) - nh{\) + (13.21) 

_n^m'({-^f 1 \ 

- 2 A. k \ A^-i (2 - A)^-i ; ' 

and the contour of integration makes $z = \lambda + it\beta$ traverse the circle $|z| = \lambda$ as $t$ varies. The
function $h(z)$ in (13.21) is the function defined in (10.12). We are essentially simplifying
the proof of Lemma 3 by choosing a path of integration through the saddle point $z = \lambda$,
as in the proof of Theorem 4 in [14]. The proof of that theorem justifies restricting $t$ to a
neighborhood of zero, so that the tail-exchange method can be applied as in the derivation
following (10.9). It follows that the probability of excess 0 is $1 - O(n^2/(n-2m)^3)$ for $m$
in the stated range. We have in fact the estimate

Prmn(^o) = 1 - ^ a-'(l + 0(a-3) + 0{an~^''')) (13.23) 

when m = \n{l — an~^/^)^ uniformly for (Inn)^ < a < 

It is interesting to note that the tail-exchange method can be used to extend (13.23)
to an asymptotic series in $\alpha^{-3}$ and $\alpha n^{-1/3}$, although the integral (13.18) actually diverges
if we let $t$ run through all real values from $-\infty$ to $+\infty$ instead of describing the stated
contour. Indeed, the magnitude of the integrand in (13.18) for large real values of $|t|$ is
approximately $|t|^{n-2m-1/2}$.
14. Probability distribution of the excess. One way to check our calculations is to
verify that the probabilities in (13.17) sum to 1. Thus we want to prove that

$$\sqrt{\frac{2\pi}{3}}\;\sum_{k\ge0}\frac{(\frac12 3^{2/3}\mu)^k}{k!}\sum_{r\ge0}\frac{(6r)!}{2^{5r}3^{3r}(2r)!\,(3r)!\;\Gamma(r+1/2-2k/3)} = e^{\mu^3/6}. \tag{14.1}$$

The inner sum is a hypergeometric series whose sum is known:

$$\frac{1}{\Gamma(1/2-2k/3)}\,F\Bigl(\frac16,\frac56;\frac12-\frac{2k}{3};\frac12\Bigr) = \frac{2^{1/2+2k/3}\,\sqrt{\pi}}{\Gamma((1-k)/3)\,\Gamma((2-k)/3)}. \tag{14.2}$$

Indeed, the special hypergeometric

$$f(a,b,z) = F(a, 1-a;\, b;\, z),$$

which is related to a Legendre function, satisfies

$$f\bigl(a,b,\tfrac12\bigr) = \frac{2^{1-b}\,\sqrt{\pi}\;\Gamma(b)}{\Gamma\bigl(\frac12(a+b)\bigr)\,\Gamma\bigl(\frac12(1-a+b)\bigr)}. \tag{14.3}$$

This well-known relation can be obtained by applying Euler's identity

$$F(a,b;c;z) = (1-z)^{c-a-b}F(c-a, c-b;\, c;\, z)$$

and Gauss's identities

$$F(a,b;c;1) = \bigl(\Gamma(c-a-b)\,\Gamma(c)\bigr)/\bigl(\Gamma(c-a)\,\Gamma(c-b)\bigr),$$

$$F\bigl(2a, 2b;\, a+b+\tfrac12;\, z\bigr) = F\bigl(a, b;\, a+b+\tfrac12;\, 4z(1-z)\bigr),$$

(which can be found, for example, in [17, (5.92), (5.111), exercise 5.28]):

$$(1-z)^{1-b}F(a, 1-a;\, b;\, z) = F(b-a, b+a-1;\, b;\, z) = F\bigl(\tfrac12 b-\tfrac12 a,\, \tfrac12 b+\tfrac12 a-\tfrac12;\, b;\, 4z(1-z)\bigr);$$

we obtain (14.3) by letting $z \to \frac12$.

The sum (14.2) vanishes except when $k = 3m$, and in this case the $k$th term on the
left of (14.1) reduces to simply $(\mu^3/6)^m/m!$ because of the formula

$$\Gamma\bigl(\tfrac13-m\bigr)\,\Gamma\bigl(\tfrac23-m\bigr) = 3^{3m-1/2}\,2\pi\,\frac{m!}{(3m)!}. \tag{14.4}$$

Hence (14.1) is true. It is remarkable that so much of nineteenth century mathematics has
turned out to be relevant to the study of random graphs.
When $\mu = 0$, the generating function for the limiting probabilities of excess $r$ turns
out to have a closed form: It is

$$\sum_{r\ge0}\Bigl(\frac43\Bigr)^{r}\sqrt{\frac23}\;\frac{e_r\,r!}{(2r)!}\,z^r
 = \sqrt{\frac23}\;F\Bigl(\frac16,\frac56;\frac12;\frac z2\Bigr)
 = \frac{2\cos\bigl(\frac23\arcsin\sqrt{z/2}\,\bigr)}{\sqrt{3(2-z)}}. \tag{14.5}$$

From this expression it is easy to calculate the limiting value of the mean excess when
$m = \frac12 n$, namely $\frac12 - 3^{-3/2} \approx 0.308$. The variance, similarly, is $\frac{23}{27} - 3^{-3/2}$.
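
The mean and variance just quoted can be confirmed by summing the distribution directly. In the sketch below, which is only illustrative, the limiting probability of excess $r$ at $\mu = 0$ is taken to be $\sqrt{2/3}\,(4/3)^r e_r\,r!/(2r)!$, and $e_r = (6r)!/(2^{5r}3^{2r}(3r)!(2r)!)$ is an assumed closed form for the constants of (7.2); the names `e_coeff` and `p_excess` and the cutoff of 150 terms are choices made here.

```python
from fractions import Fraction
from math import factorial, sqrt

def e_coeff(r):
    # Assumed closed form for e_r (the constants of (7.2)).
    return Fraction(factorial(6 * r),
                    2 ** (5 * r) * 3 ** (2 * r) * factorial(3 * r) * factorial(2 * r))

def p_excess(r):
    """Limiting probability of excess r when mu = 0."""
    exact = Fraction(4, 3) ** r * e_coeff(r) * factorial(r) / factorial(2 * r)
    return sqrt(2 / 3) * float(exact)

# The terms decay roughly like 2^(-r), so 150 terms are far more than enough.
probs = [p_excess(r) for r in range(150)]
total = sum(probs)
mean = sum(r * p for r, p in enumerate(probs))
variance = sum(r * r * p for r, p in enumerate(probs)) - mean ** 2

if __name__ == "__main__":
    print("total   :", f"{total:.9f}")
    print("mean    :", f"{mean:.9f}", "vs", f"{0.5 - 3 ** -1.5:.9f}")
    print("variance:", f"{variance:.9f}", "vs", f"{23 / 27 - 3 ** -1.5:.9f}")
```

The first few values of `probs` agree with the $\mu = 0$ row of the table in section 13.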

The limiting mean excess when the number of edges is $\frac12 n(1 + \mu n^{-1/3})$ does not seem
to have a simple closed form, although we can express it as a hypergeometric series and
find the asymptotic value. Suppose we insert the factor $z^r$ into the left-hand side of (14.1).
Then the left-hand side of (14.2) becomes

$$\frac{1}{\Gamma(1/2-2k/3)}\,F\Bigl(\frac16,\frac56;\frac12-\frac{2k}{3};\frac z2\Bigr)
 = \frac{1}{\Gamma(1/2-2k/3)}\,f\Bigl(\frac16,\,\frac12-\frac{2k}{3},\,\frac z2\Bigr). \tag{14.6}$$

To evaluate the derivative of such a function at $\frac12$, we can use the identity

z{l - z)f{a, b, z) = [az + ^— f— ^) /(a, 6, z) - ^— |— ^ /(-a, 6, z) , (14.7) 

which is readily verified by checking that the coefficients of $z^n$ agree on both sides. To get
the mean value of $r$, we want to differentiate (14.6) with respect to $z$ and set $z = 1$; and
according to (14.7), this is equivalent to replacing (14.6) by

r(l/2 - 2A;/3) t)"^^6 ' 2 §" ' 2) ~ ^3 T^f^~6 ' 2 ~ T ' 2)) • (^^-S) 

Again, /(|, ^ — i) vanishes unless k = 3m. The contribution to the mean from 
this half of (14.8) is just what we had when we were summing the probabilities, but with 
an additional factor of (^ -|- ^); so it is 

m>0 

The other half of (14.8) is, however, more complicated, since all values of k make a contri- 
bution. According to (14.3), we want to evaluate 

^ [2^ (^3VV)^ 2V^+^^/3v/^(l/3 + 2fc/3) _^ 

^V3 k\ r(l/6-A;/3)r(5/6-A;/3) ^0 + ^1 + ^2, 
where $\Sigma_j$ is a hypergeometric series corresponding to $k = 3m + j$:

1 ^ /5 7 1 2 ^3 



^° 3^3 ^U' 6' 3' 3' 6 ; ' 



^_^_^^v^^/7 3 2 4 ^3- 
^ V3 6V3r(5/6) Ve ' 2 ' 3 ' 3 ' 6 J ' 



2 62/3 r(l/6) V2 ' 6 ' 3 ' 3 ' 6 J ' 
As $z \to +\infty$, such hypergeometric series satisfy the asymptotic formula

F(a, 6; c, d; z) = ^^SS ^'^^ ( 1 + + ^ " ^) " + + ) , (14.11) 



r(a)r(6) 

where 5 = a + b — c — d; this follows by plugging the right-hand side into the differential 
equation 

$$\vartheta(\vartheta + c - 1)(\vartheta + d - 1)F = z(\vartheta + a)(\vartheta + b)F$$
satisfied by the left. We obtain



_ ,3 

e 



.7»E - 1 r(|)r(|) fir' 1 , 



,-.'/6v _ 1 r(|)r(|) 1 3 \ . 
,-."/6s _ 5v/3 n|)r(|) 1 X 

therefore e-'^'/6(So + Si + S2) = (| - | - |) + i + 0(//-3)). Subtracting this from 
(14.9), and using computer algebra to refine the estimate further, gives us the answer we 
seek: 

Theorem 6. The expected value of the excess, when there are $\frac12 n(1 + \mu n^{-1/3})$ edges,
approaches

+ 1 + + ^f^-' + 0{f.-') , (14.13) 

for fixed //>5>0asn^ 00. □ 

This method of calculation shows also that the variance will be $O(\mu^6)$ and the $k$th
moment will be $O(\mu^{3k})$; each derivative of (14.6) can do no worse than multiply by $\mu^3$,
because of (14.7).
Incidentally, the $O(\mu^{-3})$ terms in all three equations of (14.12) turn out to equal

-3 , 15 
48^ ^ 32l 



j^fi ^ + 0{fi ^), and this is no coincidence. We have, in fact. 



r(i)r(i)^^5 7.1 2. \ r(|)r(|) /7 3.2 4. 



r(i)r(^) V6 ' 6 ' 3 ' 3 ' ; r(|)r(^) Ve ' 2 ' 3 ' 3 



r(|)r(f) V2' 6 ' 3' 3 

in the sense that all three functions have the same asymptotic series ^ SkZ~^'^e^ as 
$z \to \infty$. This follows because all three functions satisfy the same differential equation, and
because their asymptotic behavior depends only on the differential equation except for a
constant of proportionality. It is well known that the general hypergeometric functions
$F(a_1, \ldots, a_m;\, b_1, \ldots, b_n;\, z)/\Gamma(b_1)\ldots\Gamma(b_n)$ and $z^{1-b_1}F(a_1+1-b_1, \ldots, a_m+1-b_1;\, b_2+
1-b_1, \ldots, b_n+1-b_1, 2-b_1;\, z)/\Gamma(b_2+1-b_1)\ldots\Gamma(b_n+1-b_1)$ both satisfy the differential
equation $\vartheta(\vartheta + b_1 - 1)\ldots(\vartheta + b_n - 1)F = z(\vartheta + a_1)\ldots(\vartheta + a_m)F$. In the case of
(14.14), even more is true: The three asymptotically equivalent functions shown there can
be written respectively as $\frac13\bigl(G(z) + G(\omega z) + G(\omega^2z)\bigr)$, $\frac13\bigl(G(z) + \omega^2G(\omega z) + \omega G(\omega^2z)\bigr)$,
$\frac13\bigl(G(z) + \omega G(\omega z) + \omega^2G(\omega^2z)\bigr)$, where



1 ^ r(3/2 + fe)^fe 
- 71 ^0 r(i/2 + /./3)/.! ^^'-^'^ 



and $\omega = e^{2\pi i/3}$.



15. Deficiency, planarity, and complexity. The calculations in the preceding section 
can be combined with the structure theory of section 9 to yield the following general result. 

Theorem 7. Let $M$ be a reduced multigraph of excess $r$ and deficiency $d$, i.e., a reduced
multigraph having $2r - d$ vertices and $3r - d$ edges. The probability that the complex part
of a random graph or multigraph reduces to $M$ is asymptotically

$$\sqrt{2\pi}\;A\bigl(3r - d + \tfrac12,\, \mu\bigr)\,\frac{\kappa(M)}{(2r-d)!}\;n^{-d/3} \tag{15.1}$$

when there are $\frac12 n(1 + \mu n^{-1/3})$ edges and $n$ vertices, $|\mu| = o(n^{1/12})$. Here $\kappa(M)$ denotes
the compensation factor (1.1), and $A(y,\mu)$ is defined in (10.2). The sum of (15.1) over
all $M$ of deficiency 0 is 1. For each $d > 0$, the probability that a random multigraph has
deficiency > d is 0((1 + /i^)''n~^/^), uniformly in n and ji. 

Proof When d = 0, this theorem is a consequence of the corollary following (9.16), together 
with (13.17) and (14.1). 

When d> 0, (15.1) is clear, but we need two auxiliary results of independent interest 
before we can prove the desired uniform estimate. 
Lemma 4. Let $E_r^{(\ge d)}$ denote the generating function for all complex multigraphs of
excess $r$ whose deficiency is at least $d$. Then

$$\frac{e_{rd}\,T}{(1-T)^{3r-d}} - \frac{(2r-d-1)\,e_{rd}\,T}{(1-T)^{3r-d-1}} \;\le\; E_r^{(\ge d)} \;\le\; \frac{e_{rd}\,T}{(1-T)^{3r-d}}, \tag{15.2}$$

where inequality between generating functions means that the coefficients of every power
of $z$ obey the stated relation.

Proof. The claim is trivial when $d = 2r - 1$; and it is true when $r = 1$, because $E_1 =
\frac{5}{24}T(1-T)^{-3} - \frac{1}{12}T(1-T)^{-2}$. The lower bound is easily seen to be a lower bound on
$e_{rd}T^{2r-d}/(1-T)^{3r-d}$ itself.
The proof of the upper bound now proceeds by induction on r. Let 

d-l 

K = Y1 ^rk{i + cre''-^ + e.dC(i + 0"^-'^-' , (15.3) 

in the notation of section 5. We want to prove that < E'^; it suffices to show, by (5.8), 
that 

(r + (1 - Tiz))^)El = (r + (1 + C)-^^)^^; > | (| C(l + C) + ^Te',_, , (15.4) 

considering both sides as generating functions in powers of z. Proceeding as in (5.11) and 
(5.12) to form 

A'r = (I C(i + + ^)K, K= (I C(i + C) + , 

a bit of algebra shows that when < d <2r — 3 we have 
{r + il + C)-'^)E',-lB'^_, 

^C(l + 0^+' (E C''-'-'-'((«fc + Pk)erd - ilk + Sk + ek)e^r-i)d) 

^k>0 

-(2r-l- dfe^r-i)(d-i)e'-''-^^ , (15.5) 

where 

„/2r-d-2\ ^ , j2r-d-2 

ak = 2{2,r-d){^ J, (3u^2{r + l)[^ ^ 

7. = (3r - 1 - d) (3r - f - d) ('^ ' , 4 = (9r - f - 3d) (^^ ' ' ^) 



2r - d - 3 
k-1 
Obviously Pk^rd ^ efee(r-i)d) since Crd > ^{r-i)d- -^iid the inequality 9r — ^ — 3ci < 
(3r - I - d)(3r - f - d) for < d < 2r - 3 yields 

In fact, (5.11)-(5.13) imply that 

2(3r - ci)ej.d > (3r - | - (3r - | - d)e(^r-i)d + (2r - d)(2r - 1 - cZ)e(r_i)(d_i) ; 

so (15.5) is a polynomial in C with nonnegative coefficients, and thus a power series in z 
with nonnegative coefficients, proving (15.4). The case d = 2r — 2 needs to be handled 
separately, but it offers no difficulty. □ 

Lemma 5. There exists a constant $\epsilon > 0$ such that, for every fixed $d \ge 0$, a random
multigraph with $n$ vertices and $m = \frac n2(1 + \mu n^{-1/3})$ edges has excess $r$ and deficiency $\ge d$
with probability



3 



\0(n-^/3e-^'^), ifr>ii^, 
uniformly in n, r, and ji when ji < ■n}!^'^ . 

Proof. Let Prd = Prdip-i A*) be the stated probability. It suffices to prove the lemma when 
> 1 and r > 1. For if r = d = 0, the result follows from Lemma 3; and pod = when 
d > 0. On the other hand, if < 1 we have Prdi'^i A*) ^ Y^'jLrPjdi'iT'j !)• 

Using Lemma 4 and arguing as in the proof of Lemma 3, equation (10.11), we obtain 

Prd = 5^ [z 7 — ^ e £;^(>d) 

^^m _ ^ _|_ ^j! v_ y 

2"^m!n! Jjn-m+r g^^y 

- ^ (n-m + r)! (1 - T)3^-'^+V2 

(n-m + r)!n2'"27rz / V 1 - -^^ / 

with /i(2;) as in (10.12), and where the integral is taken around a circle z = pe*^ with 
< p < 1. On this circle, both |(2 — 2;)/(l — 2;)| and |1 — 2;|~-'^ attain their maxima at 2; = p. 
Moreover, by (10.16) we have ^3?/i= -p£f(^) sin ^, where £f(^) > ((2-p)2-l)/9 > |(l-p); 
therefore 

^h{pe'^) < h{p) + ^(1 - p)p{cos9 - 1) < h{p) - ^p{l - p)9^, for |^| < tt. 
9 Qtt 
Now Prd = if d > 2r, because we are assuming that r > 1. Hence d — 2r + 1/2 < 0, and 
the contour integral including the factor l/(27rz) is less than 

In the following argument, unspecified positive constants will be denoted by ei, 62, 
. . . , while positive numbers that may depend on d will be denoted by Ci, C2, . . . . Let 
v = n~^/^. If we apply (10.10) to the coefficient in front of the contour integral, and if we 
use the estimate 



(n — m + r)! f^Y \r 



> {n-my ^ [^J {1- HuY > {^J e- 



(n — m)\ 

which is valid when /iv < ^, we obtain the upper bound 

where p is any number between and 1. 

Suppose now that r < 12p^, and set p = 1 — ^fif, r = ^xp,^. If C = 0(1), we have 

n/i(i - = ^eV + ^eV + 0(1) 

as in (10.17). Therefore, since p(2 — p) < 1, 

Prd < C2e,dn-'^(e//z/)"-=''^e^''^'/3+«V72-M76 

^^^^d-i/2^-d/3^^^)dgfc(x,€)M76^ /c(a;,0 = 2e^ + 3e'-l + 4xln(^-|3^ , 

by (7.16) and Stirling's formula. Given x between and 18, we minimize k{x, ^) by letting ^ 
be the positive root of ^^+^^ = 2x] notice that this makes ^ < 3, justifying our assumption 
that ^ = 0{1). The minimum k{x,^) satisfies 



A;(x,0 = e'-l + 2(e' + e')ln 



1 + 



2 
We also have $|\xi - 1| \ge \epsilon_1|x - 1|$, hence $k(x,\xi) \le -\epsilon_2(x-1)^2$. Our estimates have shown
that 

when r = ^x/i^ < 12//^, so the first half of the lemma has been proved. 

When r > 12//^, let us set p = I — r] and r = yn. In this case we will in fact prove 
the lemma for a much larger range of fi, assuming only that fiv < S, when 5 is a suitably 
small constant. If 5 < | we can assume that < y < |, since m is at most ^-^n and since 
Prd = when r > m. Using (7.16) and (15.6) again, we find 

Prd < C6r''-l/2r7'^e'^'(^'^)-'^'/^+2^'^^ 

where 

%,^)=2/ln(^^^^)+/^(l-r/) 

= 2/ In ( — J - ln(l - ry) + — - — ln(l - ry ) . 

Given y, the minimum value of l{y, rj) occurs when y = rf'{r} + jiv) /{Z — rf). However, 
we do not need to find the exact minimum, in order to achieve the upper bound in the 
lemma; it will suffice to be close to the minimum when y is small. Therefore we choose 77 
in such a way that the calculations will be relatively simple: 

With this choice, we always have f] < ^; and 

Kv^ V) = fiv) = ~ 3(-i _^2-) ~ ^ ~ ^^(1 -V) + — ^ — ln(l - ry^) . 
If we set r] = fiv, this function f{r]) reduces to 



00 



fc=i 



2k 2k + 1 3 J 6 6n 



1 1 2\ r]^ fi^ 



On the other hand, the actual value of ry must be larger than 2/11', because 2/11' is too small 
to satisfy (15.7): 



2(2uuY 16(uuY 400//"^ r 



3(1 - (2/.z.)2) - 3(1-^) 63n 



n 



When r] > fiv we have 



X ry(yy3 + 37yV + 3(y/-H) ^ . x ^ . .2 
hence when $\eta$ satisfies (15.7) we have



We have proved that 

Prd < C7r%'^/^e-'^''+^''^'', 

and this is at most Csn^'^/^e"^^^ if S is less than ^€4. □ 

Returning to the proof of Theorem 7, its final claim now follows for /j < n^/^^ by 
summing the upper bounds of Lemma 5 over all values of r. The claim is trivial when 

As remarked earlier, the fact that (15.1) sums to 1 allows us to compute asymptotic 
probabilities of any collection of graphs or multigraphs obtained as a union over an infinite 
set of reduced multigraphs, as long as at least one multigraph in the set is clean (has 
deficiency zero). We simply sum the individual probabilities, neglecting unclean cases. 

One corollary of Theorem 7 is the fact that a random graph with $\frac12(n + \mu n^{2/3})$ edges
is clean with probability $1 - o(1)$ whenever $\mu = o(n^{1/12})$. Stepanov proved this for $\mu < 0$
[36, Theorem 3] and conjectured that it would also hold for positive $\mu$. His conjecture was
proved for all fixed $\mu$ by Luczak, Pittel, and Wierman [28].

Erdos and Renyi remarked in their pioneering paper [13, §8] that, if $x$ is any real
number, the probability that a graph with $\frac12 n + xn^{1/2}$ edges is nonplanar "has a positive
lower limit, but we cannot calculate its value. It may even be 1, though this seems
unlikely." They gave no proof that the limiting probability is positive, and their remark
was embedded in a section of [13] that contains a technical error (see [27]); but a proof of
their assertion was found later by Stepanov [36, Corollary 2 following (10)]. In the other
direction, the fact that nonplanarity occurs with probability strictly less than 1 follows
from the fact that a graph with $\frac12 n + o(n^{2/3})$ edges has excess 0 with probability $\sqrt{2/3}$, as
observed in [14, Corollary 8].

We are now in a position to make a more precise estimate of the probability in question. 

Theorem 8. The probability that a graph with $\frac12 n + o(n^{2/3})$ edges is nonplanar approaches
a limit $p$ as $n \to \infty$, where

$$0.000229 < p < 0.012926. \tag{15.8}$$

Proof. The condition $m = \frac12 n + o(n^{2/3})$ is equivalent to saying that $\mu = o(1)$ when $m =
\frac12 n(1 + \mu n^{-1/3})$, so we can let $\mu = 0$ in the asymptotic formulas above. By Theorem 7, the
constant $p$ is the sum $\sum \sqrt{2\pi}\,A(3r + \frac12, 0)\,\kappa(M)/(2r)! = \sum (\frac43)^r\sqrt{\frac23}\;r!\,\kappa(M)/(2r)!^2$ over all
nonplanar, reduced, labeled, clean multigraphs $M$, where $r = r(M)$ is the excess of $M$.
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A clean multigraph cannot contain a subgraph that is homeomorphic to the complete 
graph K^, i.e., a subgraph that cancels to K^, because has deficiency 5. Adding an 
edge to any multigraph increases the excess by or 1 and increases the deficiency by 0, 1, 
or 2 (sec section 20 below for further discussion); hence all subgraphs of a clean multigraph 
are clean. Indeed, this argument implies that a random graph with |n + o(n^/^) edges has 
probability 0(n~^/^) of containing a K^. 

Therefore, if a sparse graph or multigraph is nonplanar, its nonplanarity comes almost
surely from a subgraph that cancels to the complete bipartite graph $K_{3,3}$, which is clean
and has excess 3.

One way to obtain bounds on $p$ is to restrict consideration to reduced multigraphs
whose components all have excess $\le 3$. If such a multigraph contains a $K_{3,3}$, it corresponds
only to nonplanar graphs; if it does not, it corresponds only to planar graphs. The
difference between the upper and lower bounds so obtained is the probability that a random
graph of $\frac12 n + o(n^{2/3})$ edges has at least one component of excess $\ge 4$, i.e., that at least
one component is more than tetracyclic.

The multigraph $K_{3,3}$ has compensation factor 1, because it is a graph, and its vertices
can be labeled in $\frac12\binom63 = 10$ different ways. Thus it contributes only $\frac{10}{6!} = \frac1{72}$ to the
constant $c_3 = \frac{1105}{1152}$ that accounts for all clean connected multigraphs of excess 3.

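Let $f_r = [z^r]\exp(c_1z + c_2z^2 + c_3z^3)$ and $g_r = [z^r]\exp\bigl(c_1z + c_2z^2 + (c_3 - \frac1{72})z^3\bigr)$. Then
the quantities

$$\hat p = \sum_{r\ge0}\sqrt{\tfrac23}\,\bigl(\tfrac43\bigr)^{r}\,\frac{r!}{(2r)!}\,f_r, \qquad \hat q = \sum_{r\ge0}\sqrt{\tfrac23}\,\bigl(\tfrac43\bigr)^{r}\,\frac{r!}{(2r)!}\,g_r$$

are respectively the probability that a sparse graph has all components of excess $\le 3$
and the probability that, moreover, no component cancels to $K_{3,3}$. These series converge
rapidly and lead to the numerical bounds $\hat p - \hat q$ and $1 - \hat q$ in (15.8). □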
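
The numerical bounds of Theorem 8 can be reproduced with a few lines of floating-point arithmetic. The following is a minimal sketch, not the authors' computation: it assumes the two series have the form $\sum_r \sqrt{2/3}\,(4/3)^r\, r!\,[z^r]\exp(\cdots)/(2r)!$ obtained by setting $\mu = 0$ in Theorem 7, and the function names `exp_coeffs` and `sparse_probability` are hypothetical helpers introduced only for illustration.

```python
from math import cosh, factorial, sqrt

# Constants quoted in the text for clean connected multigraphs of excess 1, 2, 3.
C = [5 / 24, 5 / 16, 1105 / 1152]

def exp_coeffs(c, terms):
    """Coefficients of exp(c[0] z + c[1] z^2 + ...) up to z^(terms-1).

    Uses F' = G' F, i.e. r f_r = sum_j j c_j f_{r-j}.
    """
    f = [1.0] + [0.0] * (terms - 1)
    for r in range(1, terms):
        f[r] = sum(j * c[j - 1] * f[r - j]
                   for j in range(1, min(r, len(c)) + 1)) / r
    return f

def sparse_probability(c, terms=40):
    """sum_r sqrt(2/3) (4/3)^r r!/(2r)! [z^r] exp(sum_j c_j z^j)  (assumed form)."""
    f = exp_coeffs(c, terms)
    return sqrt(2 / 3) * sum(
        (4 / 3) ** r * factorial(r) / factorial(2 * r) * f[r] for r in range(terms))

p_hat = sparse_probability(C)                                # all components of excess <= 3
q_hat = sparse_probability([C[0], C[1], C[2] - 1 / 72])      # ... and none cancels to K_{3,3}
lower, upper = p_hat - q_hat, 1 - q_hat
print(f"{lower:.6f} < p < {upper:.6f}")

# Sanity check against the abstract: with bicyclic components only, the same
# series collapses to sqrt(2/3) * cosh(sqrt(5/18)).
bicyclic_only = sparse_probability([5 / 24])
print(bicyclic_only, sqrt(2 / 3) * cosh(sqrt(5 / 18)))
```

The recurrence in `exp_coeffs` avoids expanding the exponential symbolically; forty terms are far more than needed, since the weights $(4/3)^r r!/(2r)!$ decay superexponentially.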

It is interesting to study the expected number $\mathrm{E}\,n_1$ of vertices in complex components,
as a function of $\mu$, because it will be the expected number of vertices in the giant
component when $\mu$ increases. We have $\mathrm{E}\,n_1 = \sum_r \mathrm{E}(n_1 \mid \mathcal{E}_r)\Pr(\mathcal{E}_r)$. By (13.17) and the remarks
preceding (13.13), each term in this sum can be approximated, to within relative error
$O\bigl((1+\mu^4)n^{-1/3}\bigr)$, by $3r\sqrt{2\pi}\,e_r\,A(3r+\frac52,\mu)\,n^{2/3}$. Let us, for simplicity, assume that $\mu$ is
bounded. Then the proof of Lemma 5 is easily modified to show that the rth term of 
the sum is 0(n^/"^(r + l)e~'^''), uniformly in n and r. Thus, by dominated convergence, 
$\mathrm{E}\,n_1 = \bigl(f(\mu) + o(1)\bigr)n^{2/3}$, where

$$f(\mu) = \sum_{r\ge0} 3r\sqrt{2\pi}\,e_r\,A(3r+\tfrac52,\mu). \tag{15.9}$$
Equation (10.23) tells us that

$$\tfrac12\mu^2 f(\mu) + f'(\mu) = \tfrac12\sum_{r\ge0} 3r\sqrt{2\pi}\,e_r\,A(3r+\tfrac12,\mu) = \tfrac32\,g(\mu), \tag{15.10}$$

where $g(\mu)$ is the expected value of $r$; we calculated $g(\mu)$ in the discussion leading up to
Theorem 6. Thus, we obtain the estimate

/(/.) = 2/x - /.-^ - f f.-' -^f^-' + 0{n-'') , (15.11) 

for > 5 > 0, by combining (15.9) with the asymptotic formula for g(fi) in (14.13). 

We can express /(/i) in "closed hypergeometric form" by proceeding as in (14.9) and 
(14.10). The result is 

jyl^> 37/6r(|) +^ 4^^ 1^3' 3' 6 



37/6^(1) V2 ' 6 ' 3 ' 3 ' 6 
// ^ /5 7 2 4 ^3 



2^3 V6 ' 6 ' 3 ' 3 ' 6 

3V%^^fI 3.4 5.//- 
27/3r(f) U' 2' 3' 3' 6 ' ' • > 

It is instructive to compare this expression with alternative formulas for the same quantity 
obtained in [28] by a different method: 

^^^^ ^ ^/"'(iJ^^^''^')"''^"'^'^'' 

= ^+^ \- dx-- / e^^^-'^^dx. (15.13) 

V2^ Jo a;3/2 4 Jo 



Here G{x, /i) = ((^u — x)^ — and ^^/^ is Wright's asymptotic estimate [44] 

for the number of connected graphs with excess r. 

16. Evolutionary paths. Consider any graph or multigraph that evolves by starting out 
with isolated vertices and then by acquiring edges one at a time. Initially its excess is 0; 
then each new edge either preserves the current excess or increases it by 1. We observed 
in section 4, following (4.7), that a new edge augments the excess if and only if both of its 
endpoints are currently in the cyclic part. We observed in section 13 that many interesting 
statistics about random graphs can be usefully represented in terms of probabilities that 
are conditional on the graph having a given excess. Therefore it is natural to look more
closely at the way a graph changes character as its excess grows. 

Every evolution of a graph or multigraph traces a path from left to right in the 
following diagram, which shows the beginning of an infinite partial ordering of all possible 
configurations $[r_1, r_2, \ldots, r_q]$:

[Figure 1 diagram not reproduced. The only legible fragment is the configuration [0,0,0,1],
labeled with the ratio 7029504/7436429.]

Figure 1. The evolution of complex components. Each configuration
$[r_1, r_2, \ldots, r_q]$ stands for a graph or multigraph with $r_1$ bicyclic components,
$r_2$ tricyclic components, $\ldots$, and $r_q$ $(q+1)$-cyclic components. As
a graph evolves, its excess $r_1 + 2r_2 + 3r_3 + \cdots$ increases in unit steps,
and the configurations follow a path from left to right in this partial
ordering.



When complex components begin to form, they follow a path in this diagram, with the 
indicated transition probabilities. The upper path is followed most frequently; on this 
path there is a unique complex component that will become the "giant." Parenthesized 
ratios are the probabilities of reaching a given configuration. At the moment the excess 
first reaches 2, the configuration must either be [0, 1] (one tricyclic component) or [2] 
(two bicyclic components). When the excess goes from 2 to 3, we go either from [0,1] 
to [0,0,1] or [1,1], or from [2] to [0,0,1], [1,1], or [3]; and so on. Each configuration
$[r_1, r_2, \ldots, r_q]$ corresponds to a partition of the excess $r = r_1 + 2r_2 + \cdots + qr_q$. The
fraction in parentheses shown above each configuration in Figure 1 is the limiting probability
$c_1^{r_1}c_2^{r_2}\ldots c_q^{r_q}/(r_1!\,r_2!\ldots r_q!\,e_r)$ that a random graph of excess $r$ has configuration
$[r_1, r_2, \ldots, r_q]$. This is the limiting probability that the infinite path traced out in the
infinite extension of Figure 1 will pass through $[r_1, r_2, \ldots, r_q]$ during the evolution of a
random graph or multigraph on a large number of vertices.
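
The parenthesized fractions of Figure 1 can be recomputed exactly. The sketch below assumes the closed form $e_r = (6r)!/\bigl(2^{5r}3^{2r}(3r)!(2r)!\bigr)$, which is consistent with the recurrence for $e_r$ quoted at the end of this section, and it assumes that the constants $c_j$ are the coefficients of the series logarithm of $\sum_r e_r z^r$; the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def e(r):
    """Assumed closed form e_r = (6r)! / (2^(5r) 3^(2r) (3r)! (2r)!)."""
    return Fraction(factorial(6 * r),
                    2 ** (5 * r) * 3 ** (2 * r) * factorial(3 * r) * factorial(2 * r))

def connected_constants(rmax):
    """c_1..c_rmax defined by exp(sum_j c_j z^j) = sum_r e_r z^r (index 0 unused)."""
    c = [Fraction(0)] * (rmax + 1)
    for r in range(1, rmax + 1):
        # From E' = C' E:  r e_r = sum_{k=1}^{r} k c_k e_{r-k}, and e_0 = 1.
        c[r] = e(r) - sum((k * c[k] * e(r - k) for k in range(1, r)), Fraction(0)) / r
    return c

def config_probability(config):
    """Limiting probability c_1^{r_1}...c_q^{r_q} / (r_1!...r_q! e_r) of [r_1,...,r_q]."""
    c = connected_constants(len(config))
    r = sum(j * rj for j, rj in enumerate(config, start=1))
    prob = Fraction(1) / e(r)
    for j, rj in enumerate(config, start=1):
        prob *= c[j] ** rj / factorial(rj)
    return prob

print(connected_constants(3)[1:])          # c_1, c_2, c_3
print(config_probability([0, 0, 0, 1]))    # fraction attached to [0,0,0,1]
```

Exact rational arithmetic is used so that the results can be compared digit for digit with the fractions printed in the figure.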

A random graph almost always acquires nearly $\frac12 n$ edges before taking the first step
from [0] to [1] in Figure 1. Indeed, the uniform estimate (13.17), with $\mu = -n^{1/12}$, implies
that the probability of excess $r$ when $m = \frac12 n\exp(-n^{-1/4})$ is of order $n^{-r/4}$.

The fractions shown on arcs leading between configurations are transition probabilities,
namely the limiting probabilities that a random graph of configuration $[r_1, r_2, \ldots, r_q]$
will go to another specified configuration when its excess next changes. For example,
a random graph in configuration [2], having two bicyclic components and no other complex
components, will proceed next to configuration [1,1] with probability $\frac{144}{221}$. These
transition probabilities have a fairly simple characterization:

Theorem 9. Let $r_1 + 2r_2 + \cdots + qr_q = r$ and $\delta_1 + 2\delta_2 + 3\delta_3 + \cdots = 1$. The asymptotic
probability that a random graph or multigraph of configuration $[r_1, r_2, \ldots, r_q]$, having no
acyclic components, will change to configuration $[r_1+\delta_1, r_2+\delta_2, \ldots, r_q+\delta_q, \delta_{q+1}, \ldots]$ when
a random edge is added, can be computed as follows:

Nonzero $\delta$'s                                                        Probability

$\delta_1 = 1$                                                            $\frac54/(3r+\frac12)(3r+\frac52)$
$\delta_j = -1$, $\delta_{j+1} = 1$                                       $9j(j+1)r_j/(3r+\frac12)(3r+\frac52)$
$\delta_j = -2$, $\delta_{2j+1} = 1$                                      $9j^2r_j(r_j-1)/(3r+\frac12)(3r+\frac52)$
$\delta_j = -1$, $\delta_k = -1$, $\delta_{j+k+1} = 1$, $j < k$           $18jkr_jr_k/(3r+\frac12)(3r+\frac52)$

In all other cases, the probability is 0. The estimates are correct to within $O(n^{-1/2})$ when
there are $n$ vertices.
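
The four rows of Theorem 9 translate directly into a small routine. The following sketch (the function name `transitions` and the tuple representation of configurations are illustrative choices, not notation from the paper) returns the limiting probability of each successor configuration and makes it easy to confirm that the rows sum to 1, since $\frac54 + 9r + 9r^2 = (3r+\frac12)(3r+\frac52)$.

```python
from collections import defaultdict
from fractions import Fraction

def transitions(config):
    """Successor configurations and limiting probabilities, following Theorem 9.

    config[j-1] = r_j, the number of components of excess j.
    """
    config = list(config)
    r = sum(j * rj for j, rj in enumerate(config, start=1))
    denom = (3 * r + Fraction(1, 2)) * (3 * r + Fraction(5, 2))
    out = defaultdict(Fraction)

    def move(removed, added, weight):
        new = config + [0] * (added - len(config))
        for j in removed:
            new[j - 1] -= 1
        new[added - 1] += 1
        while new and new[-1] == 0:      # normalize away trailing zeros
            new.pop()
        out[tuple(new)] += Fraction(weight) / denom

    move([], 1, Fraction(5, 4))                                 # delta_1 = 1
    for j, rj in enumerate(config, start=1):
        if rj >= 1:
            move([j], j + 1, 9 * j * (j + 1) * rj)              # delta_j = -1, delta_{j+1} = 1
        if rj >= 2:
            move([j, j], 2 * j + 1, 9 * j * j * rj * (rj - 1))  # delta_j = -2, delta_{2j+1} = 1
        for k in range(j + 1, len(config) + 1):
            rk = config[k - 1]
            if rj >= 1 and rk >= 1:
                move([j, k], j + k + 1, 18 * j * k * rj * rk)   # delta_j = delta_k = -1
    return dict(out)

for target, prob in sorted(transitions([2]).items()):
    print(target, prob)
```

For configuration [2] the routine lists the three successors [0,0,1], [1,1], and [3], whose probabilities share the denominator 221.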

Proof. As usual, it is easiest to consider first the uniform multigraph process. We know
that the generating function for the cyclic multigraphs under consideration is

$$S(z) = e^{V(z)}\,\frac{C_1(z)^{r_1}}{r_1!}\,\frac{C_2(z)^{r_2}}{r_2!}\cdots\frac{C_q(z)^{r_q}}{r_q!}; \tag{16.1}$$

the number of such multigraphs, weighted as usual by the compensation factor (1.1),
is $n!\,[z^n]\,S(z)$. We also know from (3.4) that $V(z) = -\frac12\ln(1 - T(z))$, hence

$$e^{V(z)} = \frac{1}{(1 - T(z))^{1/2}}.$$
We observed in section 4 that the operator $\vartheta = z\,\frac{d}{dz}$ corresponds to "marking" or
singling out a particular vertex. The function $\vartheta^2S(z)$ can therefore be regarded as the
generating function for multigraphs of configuration $[r_1, r_2, \ldots, r_q]$ together with an ordered
pair of marked vertices $\langle x,y\rangle$. When $S(z)$ is a product $A(z)B(z)$, the familiar relation

$$\vartheta^2\bigl(A(z)B(z)\bigr) = \bigl(\vartheta^2A(z)\bigr)B(z) + 2\bigl(\vartheta A(z)\bigr)\bigl(\vartheta B(z)\bigr) + A(z)\bigl(\vartheta^2B(z)\bigr) \tag{16.2}$$

has a natural combinatorial interpretation: The product $A(z)B(z)$ stands for ordered pairs
of graphs, generated respectively by $A(z)$ and $B(z)$, with no edges between them; the first
term $\bigl(\vartheta^2A(z)\bigr)B(z)$ of (16.2) corresponds to cases when both of the marked vertices $\langle x,y\rangle$
are in the graph generated by $A(z)$; the last term corresponds to cases when both $x$ and $y$
belong to the $B(z)$ graph. The middle term $2\bigl(\vartheta A(z)\bigr)\bigl(\vartheta B(z)\bigr)$ corresponds to the cases
where $x$ is in $A$ and $y$ is in $B$ or vice versa.

We can use this idea in connection with (16.1) to understand what happens when the
graph gains a new edge. The coefficient of $z^n$ in $\vartheta^2S(z)$ represents all possibilities $\langle x,y\rangle$;
we can divide this into cases by writing

$$\vartheta^2S(z) = S(z)\Biggl(\sum_{0\le j\le q}\frac{\vartheta^2f_j(z)}{f_j(z)} + 2\sum_{0\le j<k\le q}\frac{\vartheta f_j(z)}{f_j(z)}\,\frac{\vartheta f_k(z)}{f_k(z)}\Biggr), \tag{16.3}$$

where $f_0(z) = e^{V(z)}$ and $f_j(z) = C_j(z)^{r_j}/r_j!$ for $j \ge 1$. A term like $S(z)\bigl(\vartheta^2f_j(z)\bigr)/f_j(z)$,
say, then corresponds to cases where $x$ and $y$ both belong to $(j+1)$-cyclic components.

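Each of the factors $f_j(z)$ is a linear combination of powers of the quantity $\xi = 1 + \zeta =
1/(1 - T(z))$. For example, $f_0(z) = \xi^{1/2}$ and $f_1(z) = \frac5{24}\xi^3 - \frac7{24}\xi^2 + \frac1{12}\xi$ (when $r_1 = 1$), according to (3.4)
and (11.3). Hence it is easy to compute $\vartheta f_j$ and $\vartheta^2f_j$, using rule (4.5):

$$\vartheta(\xi^{\alpha}) = \alpha\,\xi^{\alpha+2} - \alpha\,\xi^{\alpha+1};$$

$$\vartheta^2(\xi^{\alpha}) = \alpha(\alpha+2)\,\xi^{\alpha+4} - \alpha(2\alpha+3)\,\xi^{\alpha+3} + \alpha(\alpha+1)\,\xi^{\alpha+2}. \tag{16.4}$$

The overall function $S(z)$ has the form $\xi^{3r+1/2}P(\xi^{-1})$ for some polynomial $P$, with $P(0) \ne
0$; hence the coefficient $[z^n]\,S(z)$ is $t_n(3r+\frac12)P(0)\bigl(1 + O(n^{-1/2})\bigr)/n!$ by (3.8) and (3.9).
It follows from (16.4) that $\vartheta^2S(z) = \xi^{3r+9/2}Q(\xi^{-1})$ for some polynomial $Q$, where $Q(0) =
(3r+\frac12)(3r+\frac52)P(0)$. Hence

$$[z^n]\,\vartheta^2S(z) = (3r+\tfrac12)(3r+\tfrac52)\,t_n(3r+\tfrac92)\,P(0)\bigl(1 + O(n^{-1/2})\bigr)/n!. \tag{16.5}$$

The transition probabilities we wish to compute are the fractions of $(3r+\frac12)(3r+\frac52)$ that
occur when $\vartheta^2$ operates on individual factors of $S(z)$.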
For example, consider first the term $S(z)\bigl(\vartheta^2f_0(z)\bigr)/f_0(z)$ of (16.3). This corresponds
to the case where both $x$ and $y$ belong to a unicyclic component (possibly the same one),
thereby creating a new bicyclic component; thus it corresponds to having $\delta_1 = 1$ and all
other $\delta_j = 0$. In this case $[z^n]\,S(z)\bigl(\vartheta^2f_0(z)\bigr)/f_0(z) \sim \frac12\cdot\frac52\,t_n(3r+\frac92)P(0)/n!$, and the
latter is asymptotically $\frac54/(3r+\frac12)(3r+\frac52)$ of the total $[z^n]\,\vartheta^2S(z)$.

The term $2S(z)\bigl(\vartheta f_0(z)\bigr)\bigl(\vartheta f_j(z)\bigr)/f_0(z)f_j(z)$, similarly, gives the probability that a
vertex from a unicyclic component joins with a $(j+1)$-cyclic component; this occurs with
probability $2(\frac12)(3jr_j)/(3r+\frac12)(3r+\frac52)$. The net effect on components corresponds to
$\delta_j = -1$, $\delta_{j+1} = +1$.

There is also another way to get $\delta_j = -1$ and $\delta_{j+1} = +1$, namely if both $x$ and $y$
belong to the same $(j+1)$-cyclic component. The probability of this case works out to
be $(3j)(3j+2)r_j/(3r+\frac12)(3r+\frac52)$; hence the total transition probability for $\delta_j = -1$ and
$\delta_{j+1} = +1$ is $9j(j+1)r_j/(3r+\frac12)(3r+\frac52)$ as stated in the theorem.

Notice that

$$\vartheta^2C_j^{\,r_j} = r_j\,C_j^{\,r_j-1}\,(\vartheta^2C_j) + r_j(r_j-1)\,C_j^{\,r_j-2}\,(\vartheta C_j)^2. \tag{16.6}$$

We have just taken care of the first term; the second term corresponds to vertices $x$ and $y$
in distinct $C_j$'s, when the new edge makes $\delta_j = -2$ and $\delta_{2j+1} = +1$. The probability is
$9j^2r_j(r_j-1)/(3r+\frac12)(3r+\frac52)$.

Finally, the term $2S(z)\bigl(\vartheta f_j(z)\bigr)\bigl(\vartheta f_k(z)\bigr)/f_j(z)f_k(z)$ of (16.3) represents a case that
occurs with probability $2(3jr_j)(3kr_k)/(3r+\frac12)(3r+\frac52)$ and corresponds to $\delta_j = \delta_k = -1$,
$\delta_{j+k+1} = +1$.

If we are working with the graph process instead of the multigraph process, we must
use $\hat C_j(z)$ instead of $C_j(z)$; but $f_0(z)$ is still essentially of degree $-1/2$ in $(1 - T(z))$ and $f_j(z)$ is
still of degree $-3jr_j$, so the asymptotic calculations work out as before.

However, in a random graph we must use the operator $\frac12(\vartheta_z^2 - \vartheta_z) - \vartheta_w$ instead of $\vartheta^2$,
and we must work with bivariate generating functions, as discussed in section 6. The bgf
corresponding to (16.1) is almost univariate, however:

$$F(w,z) = w^r\,e^{\hat V(wz)}\,\frac{\hat C_1(wz)^{r_1}}{r_1!}\cdots\frac{\hat C_q(wz)^{r_q}}{r_q!}.$$

It is not difficult to see that the effect of $\vartheta_z^2$ swamps the effects of $\vartheta_z$ and $\vartheta_w$, asymptotically,
so the multigraph analysis carries through. □

One amusing consequence of Theorem 9 is that we can use it to discover and prove
formula (7.2) for the numbers $e_r$ in a completely different way. The probability of reaching
the configuration $[r]$, consisting of $r$ bicyclic components and none of higher cyclic order,
is $c_1^r/(r!\,e_r)$. The only way to reach this configuration, when $r > 0$, is from $[r-1]$, and the
transition probability is

$$\frac{\frac54}{(3r-\frac52)(3r-\frac12)} = \frac{c_1^{\,r}/(r!\,e_r)}{c_1^{\,r-1}/\bigl((r-1)!\,e_{r-1}\bigr)}.$$

Since $c_1 = 5/24$, we have $e_r = (6r-5)(6r-1)e_{r-1}/24r$, and (7.2) follows by induction. This
indirect method is probably the simplest way to deduce the fact that Wright's constant
is $1/(2\pi)$.
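
The induction can be spot-checked mechanically. This is a minimal sketch under two assumptions that are not spelled out in this section: that formula (7.2) is the closed form $e_r = (6r)!/\bigl(2^{5r}3^{2r}(3r)!(2r)!\bigr)$, and that "Wright's constant" refers to the limit of $e_r/\bigl((\frac32)^r(r-1)!\bigr)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, pi

def e_closed(r):
    """Assumed closed form for e_r."""
    return Fraction(factorial(6 * r),
                    2 ** (5 * r) * 3 ** (2 * r) * factorial(3 * r) * factorial(2 * r))

def e_recursive(r):
    """e_r from e_0 = 1 and e_r = (6r-5)(6r-1) e_{r-1} / (24 r)."""
    value = Fraction(1)
    for k in range(1, r + 1):
        value *= Fraction((6 * k - 5) * (6 * k - 1), 24 * k)
    return value

def wright_ratio(r):
    """e_r / ((3/2)^r (r-1)!), which should approach 1/(2 pi) as r grows."""
    return float(e_recursive(r) / (Fraction(3, 2) ** r * factorial(r - 1)))

print([e_recursive(r) for r in range(4)])
print(wright_ratio(500), 1 / (2 * pi))
```

The ratio converges slowly, with relative error of order $1/r$, so the comparison with $1/(2\pi)$ is only meaningful to a few digits at $r = 500$.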

17. A near-Markov process. We proved in Theorem 9 that the transition probabilities 
shown in Figure 1 are the limiting probabilities, averaged over all multigraphs, that a 
multigraph reaching a particular state will take a particular step as its excess increases. 
But we did not prove that those transition probabilities are independent of past history. 
For all we know, the path taken to a particular configuration during the evolution of a 
random graph might strongly influence the probability distribution of its next leap forward.
The next theorem addresses this question. 

Theorem 10. For any fixed R, an evolving random graph or multigraph almost surely 
carries out a random walk in the first R levels of the partial ordering shown in Figure 1, 
with transition probabilities that approach the limiting values derived in Theorem 9. 

Proof. As in previous proofs, it suffices to consider random multigraphs. We will show
that the transition probabilities have the asymptotic behavior of Theorem 9 for all random
multigraphs that remain clean, i.e., for all multigraphs that reduce, under the pruning
and cancelling algorithms of Section 9, to 3-regular multigraphs $M$ having $2r$ vertices and
$3r$ edges, when the excess is $r < R$. We know from Theorem 7 that the multigraph will be
clean with probability $1 - O\bigl((1+\mu^4)n^{-1/3}\bigr)$; and we know from (13.17) that the probability
of excess $r$ becomes superpolynomially small as the number of edges passes $\frac n2$. So the excess
almost surely increases past any given value before a large multigraph becomes unclean.
For example, if $\mu \to \infty$ with $\mu = o(n^{1/12})$, the probability of excess $< R$ approaches zero
while the probability of remaining clean is $1 - o(1)$.

The proof for clean multigraphs is not as trivial as might be expected: Multigraphs
that follow a given path to $[r_1, r_2, \ldots, r_q]$ in the partial ordering are not uniformly
distributed, among all multigraphs whose complex parts are enumerated by the generating
function

$$e^{V}\,(C_1^{r_1}/r_1!)\,(C_2^{r_2}/r_2!)\cdots(C_q^{r_q}/r_q!)$$

assumed in the proof of Theorem 9. Past history does affect the frequency of certain types
of components. For example, a tetracyclic component that prunes and cancels to $K_{3,3}$ cannot
evolve along the path $[1] \to [2] \to [0,0,1]$; removing any edge of $K_{3,3}$ leaves a connected
graph.

Let's try to clarify the situation by working an example. Consider the reduced multigraph

[diagram not reproduced] (17.1)

and suppose we wish to compute the transition probabilities for multigraphs of excess 3 that
prune and cancel to (17.1) after following the path $[1] \to [0,1] \to [0,0,1]$. The generating
function for all such multigraphs, assuming that there are no acyclic components, would
be a constant multiple of $e^{V}T^6/(1-T)^9$, if we did not specify the past history $[1] \to [0,1] \to [0,0,1]$; but it
turns out to be only $\frac89$ as much when we prescribe the history. The reason is, intuitively,
that (17.1) has 9 edges, and a multigraph with history $[1] \to [0,1] \to [0,0,1]$ can reduce
to it only if the "middle" edge is not the last to be completed. The latter event happens
with probability $\frac89$.

A formal proof of the $\frac89$ phenomenon can be given as follows. The generating function
$e^{V}T^6/(1-T)^9$ expands to $e^{V}T^6\sum_{n_1,n_2,\ldots,n_9\ge0}T^{n_1}T^{n_2}\cdots T^{n_9}$; the individual terms
represent the insertion of $(n_1, \ldots, n_9)$ vertices into the nine edges of (17.1), after which a tree
is sprouted at each vertex. The resulting multigraph will have $n$ vertices and $m = n + 3$
edges; there will be $6 + n_1 + \cdots + n_9$ root vertices and $9 + n_1 + \cdots + n_9$ "critical" edges
on paths between root vertices. Suppose we color each critical edge with one of 9 colors,
corresponding to the original edge of (17.1) from which it was subdivided. Then among
the $m!$ permutations of edges that could generate any such multigraph, exactly $\frac89$ have
the property that the last critical edge has some color besides the "middle" color. (This
follows by symmetry between $n_1, n_2, \ldots, n_9$.) Such permutations are precisely those for
which the history will be $[1] \to [0,1] \to [0,0,1]$; hence we obtain (17.1) with exactly $\frac89$
times its overall probability, given that history.

It turns out that there are 17 unlabeled clean, connected, reduced multigraphs of
excess 3; and exactly 6 of them occur with weight $\frac89$ when the past history is $[1] \to [0,1] \to
[0,0,1]$. Those 6 occur with weight $\frac19$ when the past history is $[1] \to [2] \to [0,0,1]$, and the
other 11 do not occur at all in that case.

In general, given any $M$ that can arise for a given past history, there will be a fraction
$\beta > 0$ such that each multigraph reducing to $M$ arises $\beta$ times as often with the given
history as it does overall. The reason is a slight generalization of the method by which
we proved the $\frac89$ phenomenon: Each permutation of colors of critical edges is equally
likely to be the sequence of last appearances in a random permutation of $n_1 + n_2 + \cdots +
n_{3r}$ critical edges, and such permutations determine the past history. The generating
function for $M$ will then be a constant multiple of $\bigl(T^{2r_1}/(1-T)^{3r_1}\bigr)\bigl(T^{4r_2}/(1-T)^{6r_2}\bigr)\cdots
\bigl(T^{2qr_q}/(1-T)^{3qr_q}\bigr)$. Hence the asymptotic transition probabilities will be the same for every
feasible $M$, exactly as calculated in Theorem 9. □
18. An emerging giant. The classic papers of Erdős and Rényi [12, 13] tell us that
an evolving graph almost surely develops a single giant component, which eventually is 
surrounded by only a few trees and later by only isolated vertices, until the entire graph 
becomes connected. Thus there will be a time when the graph reaches some configuration 
[0,0,..., 0, 1] on the top line of Figure 1 and stays on that top line ever afterward. 

Indeed, the most probable path in Figure 1 is the one that goes directly from [1] to
[0,1] to [0,0,1] and so on, never leaving the top line. The first transition probability is
$\frac{72}{77}$, the next is $\frac{216}{221}$, and subsequent steps are ever more likely to stay in line. In such cases we
can see the "seed" around which the giant component is forming, before that component
has become in any way gigantic. (The complex components of any given finite excess
almost always have only $O(n^{2/3})$ vertices, a vanishingly small percentage of the total; each
step at the beginning of Figure 1 occurs after adding about $n^{2/3}$ more edges.)

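If we assume that the transition probabilities in Figure 1 are exact, the overall probability
that an evolving graph adheres strictly to the top line, never having more than one
complex component throughout its entire evolution, is

$$\prod_{r=1}^{\infty}\frac{r(r+1)}{(r+\frac16)(r+\frac56)} = \frac{\Gamma(\frac76)\,\Gamma(\frac{11}6)}{\Gamma(1)\,\Gamma(2)} = \frac5{36}\,\Gamma\bigl(\tfrac16\bigr)\,\Gamma\bigl(\tfrac56\bigr) = \frac{5\pi}{18}. \tag{18.1}$$

Numerically, this comes to 0.8726646, roughly 7 times out of every 8.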
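
The value of the infinite product is easy to confirm numerically. In this sketch the helper names are hypothetical; the closed form uses the standard library gamma function, and the partial products illustrate that the top-line probability over the first $R$ steps decreases toward $5\pi/18$ from above.

```python
from math import gamma, pi

def top_line_step(r):
    """Limiting probability r(r+1)/((r+1/6)(r+5/6)) of staying on the top line."""
    return r * (r + 1) / ((r + 1 / 6) * (r + 5 / 6))

def top_line_partial(R):
    """Product of the first R top-line transition probabilities."""
    prod = 1.0
    for r in range(1, R + 1):
        prod *= top_line_step(r)
    return prod

closed_form = gamma(7 / 6) * gamma(11 / 6)   # Gamma(1) = Gamma(2) = 1
print(closed_form, 5 * pi / 18)
print(top_line_partial(10), top_line_partial(10_000))
```

The partial products converge only at rate $O(1/R)$, which is why the gamma-function evaluation is the practical way to get seven correct digits.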

Is $\frac{5\pi}{18}$ the true limiting probability that an evolving graph or multigraph never acquires
two simultaneous components of positive excess, throughout its evolution? We can at least
prove that $\frac{5\pi}{18}$ is an upper bound. For we know from Theorem 10 that an evolving graph
will hug the top line of Figure 1 for at least $R$ steps with probability

$$\prod_{r=1}^{R}\frac{r(r+1)}{(r+\frac16)(r+\frac56)} + o(1)$$

for any fixed $R$, as $n \to \infty$.

It is natural to conjecture that $\frac{5\pi}{18}$ is also a lower bound, because a large component
tends to propagate itself as soon as it becomes large enough. Still, it is conceivable that
a random graph might have a tendency to leave the top line briefly when it first becomes
unclean. The transition probability for remaining on the top line becomes strictly less than
$r(r+1)/(r+\frac16)(r+\frac56)$ when the graph has a positive deficiency. For example, suppose the
initial bicyclic component is already unclean; it will then correspond to the double self-loop
of (9.15). We know from (13.16) that this case arises with probability $O(n^{-1/3})$. But if
it does occur, the generating function for the complex part will be a constant multiple of
$T/(1-T)^2$ instead of $T^2/(1-T)^3$, so the proof technique of Theorem 9 will yield a transition
probability from [1] to [0,1] of only $\frac89$ instead of $\frac{72}{77}$. In general, when the deficiency is $d$,
the asymptotic transition probability drops to

$$(r-\tfrac d3)(r-\tfrac d3+1)\big/(r-\tfrac d3+\tfrac16)(r-\tfrac d3+\tfrac56).$$

This probability estimate is, moreover, valid only when the excess is reasonably small as a
function of $n$; otherwise the trees that sprout from the pruned multigraph $M$ will not be
large enough to assert their asymptotic behavior.
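
The effect of deficiency on the top-line probability can be tabulated exactly. The sketch below simply evaluates the displayed expression with $x = r - d/3$; the function name is a hypothetical helper.

```python
from fractions import Fraction

def top_line_probability(r, d=0):
    """x(x+1) / ((x+1/6)(x+5/6)) with x = r - d/3, for excess r and deficiency d."""
    x = Fraction(3 * r - d, 3)
    return x * (x + 1) / ((x + Fraction(1, 6)) * (x + Fraction(5, 6)))

for d in range(3):
    print(d, top_line_probability(1, d), float(top_line_probability(1, d)))
```

For $r = 1$ the clean case gives 72/77 and the double self-loop ($d = 1$) gives 8/9, in agreement with the example in the text.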

19. A monotonicity property. During the time when an evolving graph or multigraph
stays clean, we can show that the asymptotic top-line transition probabilities
$r(r+1)/(r+\frac16)(r+\frac56)$ are in fact lower bounds for the correct (non-asymptotic) probabilities.
More precisely, the proof of Theorem 9 shows that the true transition probability is a ratio
of expressions involving the tree polynomials $t_n(y)$, when there are $n$ vertices in the cyclic
part of the multigraph. We will prove that this ratio decreases monotonically to
$r(r+1)/(r+\frac16)(r+\frac56)$ as $n \to \infty$.

First we need to prove an auxiliary result about tree polynomials that is interesting
in its own right. Let us generalize the definition of $t_n(y)$ in (3.8) by introducing a new
parameter $m \ge 0$:

$$\frac{T(z)^m}{(1-T(z))^y} = \sum_{n\ge0} t_{m,n}(y)\,\frac{z^n}{n!}. \tag{19.1}$$

Thus

$$t_{m,n}(y) = \sum_j \binom mj(-1)^j\,t_n(y-j) \tag{19.2}$$

is the $m$th backward difference of $t_n(y)$.

Lemma 6. Let $m$ be a nonnegative integer. For any fixed integer $n \ge m$ and arbitrary
real $y > 0$, the ratio $t_{m,n+1}(y)/t_{m,n}(y)$ is an increasing function of $y$. Equivalently, for
fixed $y > 0$ and any integer $n \ge m$, the ratio $t'_{m,n}(y)/t_{m,n}(y)$ is an increasing function of $n$.

Proof. The two statements of the lemma are clearly equivalent, because $t_{m,n}(y)$ is positive
when $y > 0$ and $n \ge m$.

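Equation (2.12) of [24] states that

$$t_n(y) = n^{n-1}\sum_{k\ge0}\frac{y^{\overline{k+1}}}{k!}\,\frac{(n-1)^{\underline{k}}}{n^{k}}, \tag{19.3}$$

where $x^{\overline{k}}$ means $x(x+1)\ldots(x+k-1)$ and $x^{\underline{k}}$ means $x(x-1)\ldots(x-k+1)$. Therefore,
by (19.2),

$$t_{m,n}(y) = n^{n-1}\sum_{k\ge0}(k+1)^{\underline{m}}\,\frac{y^{\overline{k+1-m}}}{k!}\,\frac{(n-1)^{\underline{k}}}{n^{k}} = n^{n-m}\sum_{k=0}^{n-m}(k+m)\,\frac{y^{\overline{k}}}{k!}\,\frac{(n-1)^{\underline{k+m-1}}}{n^{k}}. \tag{19.4}$$

It follows that the inequality $t'_{m,n}(y)/t_{m,n}(y) < t'_{m,n+1}(y)/t_{m,n+1}(y)$ is equivalent to

$$\frac{\sum_{k=0}^{N}a_k\alpha_k}{\sum_{k=0}^{N}b_k\alpha_k} > \frac{\sum_{k=0}^{N}a_k\beta_k}{\sum_{k=0}^{N}b_k\beta_k}, \tag{19.5}$$

where $N = n + 1 - m$ and

$$a_k = (k+m)\,\frac{y^{\overline{k}}}{k!}, \quad b_k = (k+m)\,\frac{d}{dy}\frac{y^{\overline{k}}}{k!}, \quad \alpha_k = (n-1)^{\underline{k+m-1}}\,n^{n-m-k}, \quad \beta_k = n^{\underline{k+m-1}}\,(n+1)^{n+1-m-k}. \tag{19.6}$$

The following condition is sufficient to prove (19.5), assuming positive denominators:

$$\frac{a_0}{b_0} > \frac{a_1}{b_1} > \cdots > \frac{a_N}{b_N} \quad\text{and}\quad \frac{\alpha_0}{\beta_0} > \frac{\alpha_1}{\beta_1} > \cdots > \frac{\alpha_N}{\beta_N}. \tag{19.7}$$

For we have

$$\sum_{k=0}^{N}a_k\alpha_k\sum_{k=0}^{N}b_k\beta_k - \sum_{k=0}^{N}a_k\beta_k\sum_{k=0}^{N}b_k\alpha_k = \sum_{0\le j<k\le N}(a_jb_k - b_ja_k)(\beta_k\alpha_j - \beta_j\alpha_k) > 0. \tag{19.8}$$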

(Historical note: Inequality (19.5) under condition (19.7) goes back at least to Seitz in
1936 [33]; see [29, Section 2.5, Theorem 4], where a supplementary condition is needed:
The product of the denominators must be positive. In linearly ordered discrete probability
space, the inequality is equivalent to saying that $\mathrm{E}\bigl(f(X)g(X)\bigr) \ge \mathrm{E}\bigl(f(X)\bigr)\,\mathrm{E}\bigl(g(X)\bigr)$
whenever $f$ and $g$ are increasing functions of the random variable $X$. This inequality is, in
turn, a special case of the celebrated FKG inequality [15], which applies to certain partially
ordered probability spaces. The equality in (19.8), which reduces to Lagrange's identity
when we set $\alpha_k = a_k$ and $\beta_k = b_k$, is the Binet-Cauchy identity for $\det AB$ when $A$ is a
matrix of size $2 \times n$ and $B$ is $n \times 2$.)

And (19.7) is not difficult to verify, under the substitutions (19.6). We have

$$\frac{b_{k+1}}{a_{k+1}} = \frac1y + \frac1{y+1} + \cdots + \frac1{y+k} = \frac{b_k}{a_k} + \frac1{y+k},$$

$$\frac{\alpha_{k+1}}{\alpha_k} = \frac{n-k-m}{n} < \frac{n-k-m+1}{n+1} = \frac{\beta_{k+1}}{\beta_k}.$$

(When $m = 0$ we omit the terms for $k = 0$.) □
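
Lemma 6 can be explored numerically with exact rational arithmetic. The sketch below is illustrative only: it evaluates $t_{m,n}(y)$ from the expansion $n!\,[z^n]\,T(z)^{k+m} = (k+m)\,n^{\underline{k+m}}\,n^{n-k-m-1}$, which is algebraically equivalent to the sum in (19.4) for $n \ge 1$, and it compares the result with the backward difference (19.2). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def rising(x, k):
    """x (x+1) ... (x+k-1)."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out

def falling(x, k):
    """x (x-1) ... (x-k+1)."""
    out = Fraction(1)
    for i in range(k):
        out *= x - i
    return out

def t(m, n, y):
    """t_{m,n}(y) = n! [z^n] T(z)^m / (1 - T(z))^y, for n >= 1."""
    y = Fraction(y)
    total = Fraction(0)
    for k in range(0, n - m + 1):
        total += ((k + m) * rising(y, k) / factorial(k)
                  * falling(n, k + m) * Fraction(n) ** (n - k - m - 1))
    return total

def backward_difference(m, n, y):
    """Right-hand side of (19.2): sum_j C(m,j) (-1)^j t_n(y - j)."""
    return sum((comb(m, j) * (-1) ** j * t(0, n, Fraction(y) - j)
                for j in range(m + 1)), Fraction(0))

def ratio(m, n, y):
    """t_{m,n+1}(y) / t_{m,n}(y), claimed by Lemma 6 to increase with y."""
    return t(m, n + 1, y) / t(m, n, y)

print(t(2, 5, Fraction(7, 2)), backward_difference(2, 5, Fraction(7, 2)))
print([float(ratio(2, 5, Fraction(k, 2))) for k in range(1, 6)])
```

A finite check of this kind proves nothing, but it is a quick guard against sign or indexing slips when the formulas are transcribed.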
Assume now that the cyclic part of a random multigraph contains $n$ vertices. The
"top line" transition probability from a single clean component of excess $r$ to a single
component of excess $r + 1$ is $1 - p_{nr}$, where $p_{nr}$ is the probability that a new bicyclic
component will be formed. By the argument of Theorem 9,

$$p_{nr} = \frac{[z^n]\,\bigl(\vartheta^2e^{V(z)}\bigr)S(z)}{[z^n]\,\vartheta^2\bigl(e^{V(z)}S(z)\bigr)}, \tag{19.9}$$

where $e^{V(z)} = 1/(1-T(z))^{1/2}$ is the generating function for unicyclic components and
$S(z) = T(z)^{2r}/(1-T(z))^{3r}$ is a prototypical generating function for clean components of
excess $r$. We want to show that $p_{nr}$ is an increasing function of $n$, since we want $1 - p_{nr}$
to be decreasing.

Let's work on a simpler problem first, showing that

$$q_{nr} = \frac{[z^n]\,\bigl(\vartheta A(z)\bigr)S(z)}{[z^n]\,\vartheta\bigl(A(z)S(z)\bigr)} \tag{19.10}$$

is an increasing function of $n$ whenever

$$A(z) = \frac{T(z)^a}{(1-T(z))^b}, \qquad b > \tfrac32a. \tag{19.11}$$

Here $a$ is a nonnegative integer; we will assume that $n \ge 2r + a$, so that the denominator
of (19.10) is nonzero. We have

$$\vartheta A(z) = \frac{b\,T(z)^a}{(1-T(z))^{b+2}} - \frac{(b-a)\,T(z)^a}{(1-T(z))^{b+1}},$$

$$\vartheta\bigl(A(z)S(z)\bigr) = \frac{(3r+b)\,T(z)^{2r+a}}{(1-T(z))^{3r+b+2}} - \frac{(r+b-a)\,T(z)^{2r+a}}{(1-T(z))^{3r+b+1}};$$

hence

$$q_{nr} = \frac{b\,t_{2r+a,n}(3r+b+2) - (b-a)\,t_{2r+a,n}(3r+b+1)}{(3r+b)\,t_{2r+a,n}(3r+b+2) - (r+b-a)\,t_{2r+a,n}(3r+b+1)}
= \frac{b}{3r+b}\left(1 - \frac{r(2b-3a)/(3rb+b^2)}{\dfrac{t_{2r+a,n}(3r+b+2)}{t_{2r+a,n}(3r+b+1)} - \dfrac{r+b-a}{3r+b}}\right).$$

Since the coefficients of $t_{2r+a,n}(y)$ are nonnegative, we have

$$t_{2r+a,n}(3r+b+2)/t_{2r+a,n}(3r+b+1) > 1 > (r+b-a)/(3r+b).$$
It follows that $q_{nr}$ is increasing iff

$$\frac{t_{2r+a,n}(3r+b+2)}{t_{2r+a,n}(3r+b+1)} < \frac{t_{2r+a,n+1}(3r+b+2)}{t_{2r+a,n+1}(3r+b+1)}. \tag{19.12}$$

And (19.12) does hold, because $t_{2r+a,n+1}(y)/t_{2r+a,n}(y)$ is an increasing function of $y$ by
Lemma 6.

Incidentally, this argument also shows that $q_{nr}$ is constant when $b = \frac32a$ and decreasing
when $0 < b < \frac32a$.

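Now to prove that $p_{nr}$ is increasing, we can write

$$p_{nr} = \frac{[z^n]\,\bigl(\vartheta^2e^{V(z)}\bigr)S(z)}{[z^n]\,\vartheta\bigl((\vartheta e^{V(z)})S(z)\bigr)}\cdot\frac{[z^n]\,\vartheta\bigl((\vartheta e^{V(z)})S(z)\bigr)}{[z^n]\,\vartheta^2\bigl(e^{V(z)}S(z)\bigr)}
= \frac{[z^n]\,\bigl(\vartheta^2e^{V(z)}\bigr)S(z)}{[z^n]\,\vartheta\bigl((\vartheta e^{V(z)})S(z)\bigr)}\cdot\frac{[z^n]\,\bigl(\vartheta e^{V(z)}\bigr)S(z)}{[z^n]\,\vartheta\bigl(e^{V(z)}S(z)\bigr)}.$$

The first factor is of type $q_{nr}$ if we put $A(z) = \vartheta e^{V(z)} = \frac12T(z)/(1-T(z))^{5/2}$; here $a = 1$,
$b = \frac52$, so $q_{nr}$ is increasing. The second factor is of type $q_{nr}$ if we put $A(z) = e^{V(z)}$; here
$a = 0$, $b = \frac12$, and again $q_{nr}$ is increasing. We have proved

Theorem 11. The probability that a clean random multigraph of excess $r > 0$ will not
acquire a new bicyclic component when its excess next changes is strictly greater than the
limiting value $r(r+1)/(r+\frac16)(r+\frac56)$. □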
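
Theorem 11 can be watched in action for small $n$. The sketch below is an illustration under stated assumptions: it takes $p_{nr}$ to be the ratio (19.9), expands $\vartheta^2(1-T)^{-1/2}$ with rule (16.4) at $\alpha = \frac12$, and uses $[z^n]\,\vartheta^2F = n^2[z^n]\,F$ for the denominator. The helper `t` evaluates $t_{m,n}(y)$ from the coefficients of powers of $T(z)$; all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def rising(x, k):
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out

def falling(x, k):
    out = Fraction(1)
    for i in range(k):
        out *= x - i
    return out

def t(m, n, y):
    """t_{m,n}(y) = n! [z^n] T(z)^m / (1 - T(z))^y, for n >= 1."""
    y = Fraction(y)
    total = Fraction(0)
    for k in range(0, n - m + 1):
        total += ((k + m) * rising(y, k) / factorial(k)
                  * falling(n, k + m) * Fraction(n) ** (n - k - m - 1))
    return total

def new_bicyclic_probability(n, r):
    """p_{nr}: chance that the next excess-raising edge creates a new bicyclic component."""
    base = 3 * r
    num = (Fraction(5, 4) * t(2 * r, n, base + Fraction(9, 2))
           - 2 * t(2 * r, n, base + Fraction(7, 2))
           + Fraction(3, 4) * t(2 * r, n, base + Fraction(5, 2)))
    den = n * n * t(2 * r, n, base + Fraction(1, 2))
    return num / den

limit = Fraction(5, 4) / ((3 + Fraction(1, 2)) * (3 + Fraction(5, 2)))   # r = 1
for n in (2, 3, 4, 10, 40):
    print(n, float(new_bicyclic_probability(n, 1)), float(limit))
```

For $r = 1$ the values start at 0 when $n = 2$ (no room for a unicyclic component) and climb toward the limit $5/77$, so the top-line probability $1 - p_{nr}$ stays above $72/77$.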

Theorem 11 gives further support to the $\frac{5\pi}{18}$ conjecture of Section 18, because $\frac{5\pi}{18}$
was shown there to be an upper bound. If the top-line transition probability were always
strictly greater than $r(r+1)/(r+\frac16)(r+\frac56)$, we could establish $\frac{5\pi}{18}$ as a lower bound.
However, Theorem 11 does not prove the conjecture, because the probability becomes
smaller than $r(r+1)/(r+\frac16)(r+\frac56)$ when a graph becomes unclean.

Incidentally, when the number of edges gets large, we may need asymptotic formulas 
for tn{y) that are valid when y goes to infinity with n. Formula (3.9) can be extended to 

uniformly for 1 < y < n^/"^, using the proof technique of Lemma 3. Still larger values of y 
can be handled by using the saddle point method to derive the following general estimate: 

I np (aA-l)n\(l-6)/2 

taXnA^ri + 6) = • 20F^(i-p)An + ^(^) + ^(I/x/A^)) , (19.14) 
for fixed a and 6 as A — > and An/ (fog n)^ 



oo, where 



p = 1 + cA - VA(1 + c^A) = 1 - VA + cA - ^ A^/2 + 0{X^/^) , c 



1-a 



(19.15) 



2 



For example, to estimate t2r,n(3r) when r = n^/'^, we can use (19.14) with a 
and A = 3n~^/^. The complicated dependence on p can also be expressed as 




0, 



gnp „(aA-l)n 



exp(n(l - |Aln A + iA + (| - a)X^/^ 



yX^ + 0{X'/^))), (19.16) 



(1-p) 



which is sufficiently accurate if A < n 



1/4 



20. The evolution of uncleanness. We get further insight into the behavior of an
evolving multigraph by studying how its reduced multigraph $\overline{M}$ changes as the excess
increases. Let's review the theory of Section 9 in light of what we have learned since
then. The generating function for the cyclic part of all multigraphs having excess $r$ and
deficiency $d$ is

$$E_{rd}(z) = e_{rd}\,\frac{T(z)^{2r-d}}{(1-T(z))^{3r-d+1/2}}. \tag{20.1}$$

We can interpret it as follows, ignoring the constant factor $e_{rd}$ for a moment: There is a
reduced multigraph $\overline{M}$ having $\nu = 2r - d$ vertices and $\mu = 3r - d$ edges; each vertex has
degree $\ge 3$, where a self-loop is considered to add 2 to the degree. We can obtain all cyclic
multigraphs $M$ that reduce to $\overline{M}$ by a two-step process. First we insert 0 or more vertices
of degree 2 on each edge; and we also construct any desired number of cycles, as separate
components. All of the newly constructed vertices, including the vertices in the cycles,
have degree 2. This first step creates a set of multigraphs with the univariate generating
function $z^{\nu}(1-z)^{-\mu}(1-z)^{-1/2}$, because each edge subdivision corresponds to $(1-z)^{-1}$,
and because the cycles are generated by $\exp(\frac12z + \frac14z^2 + \frac16z^3 + \cdots) = (1-z)^{-1/2}$. Now
we proceed to step two, which sprouts a rooted tree from every vertex; this changes $z$ to
$T(z)$ in the generating function.

The excess increases by 1 when we add a new edge $\langle x,y\rangle$ to $M$. How does the new
edge change $\overline{M}$? A moment's thought shows that $\overline{M}$ will gain 2, 1, or 0 vertices; this
means the deficiency will either stay the same or it will increase by 1 or 2.

In fact there is a nice algebraic and quantitative way to understand what happens, in
terms of the generating function. Again we consider a two-step process: First we choose
a vertex $x$ of $M$; this means we apply the marking operator $\vartheta$ to the generating function.
There are three cases: The marked vertex either belongs to a tree attached to one of the
$\nu$ special vertices of $\overline{M}$, or it belongs to a tree attached to a vertex within one of the
$\mu$ edges, or it belongs to a tree attached to a vertex in some cycle. We represent Case 1
by attaching a "half-edge" to the existing vertex; we represent Case 2 by introducing a
new vertex into the split edge and attaching a half-edge to it; we represent Case 3 by
introducing a new vertex with a self-loop and attaching a half-edge to it.

A half-edge is like an edge but it touches only one vertex. For example, if $\overline{M}$ is the
multigraph $K_4$, the symbolic representations of the three possible outcomes of step 1 are

[Diagrams for Case 1, Case 2, and Case 3 not reproduced.]

Let's call this augmented multigraph $\overline{M}'$.

A cyclic multigraph M' with a marked vertex can be reduced by attaching a half-edge 
to the marked vertex, then pruning all vertices of degree 1 and cancelling all vertices of 
degree 2. Conversely, the marked cyclic multigraphs that reduce to a given M' are obtained 
by adding zero or more vertices to each edge (including the half edge), also adding cycles, 
then sprouting trees from each vertex. Thus the generating function for M' in Case 1 is 



vT{zY 



/i-h3/2 ' 



(20.2) 



the V in the numerator accounts for the number of vertices that can be chosen, and the 
extra (l — T{z)) in the denominator accounts for the new half-edge. The generating 
function for M' in Case 2 is 



M+5/2 ' 



(20.3) 



now we have $\mu$ edges that can be split, and we include an additional $T(z)$ in the numerator 
for the new vertex and an additional $(1 - T(z))^2$ in the denominator for the new half-edge 
and the additional split edge. Finally, the generating function for M' in Case 3 is 



$$\frac{1}{2}\,\frac{T(z)^{\nu+1}}{(1-T(z))^{\mu+5/2}}\,; \tag{20.4}$$



as in Case 2, the diagram has gained one vertex and two edges. The factor $\frac12$ is due to the 
compensation factor $\kappa$ of a self-loop. 

If our calculations are correct, the sum (20.2) + (20.3) + (20.4) should be the result of 
applying $\vartheta$ to the overall generating function (20.1). And sure enough, 



$$\vartheta\,\frac{T(z)^{\nu}}{(1-T(z))^{\mu+1/2}} = \frac{\nu\,T(z)^{\nu}}{(1-T(z))^{\mu+3/2}} + \frac{(\mu+\frac12)\,T(z)^{\nu+1}}{(1-T(z))^{\mu+5/2}}\,; \tag{20.5}$$
everything checks out fine. 
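
The check in (20.5) can also be carried out mechanically on truncated power series. The following is a minimal sketch, not part of the original analysis: it assumes the standard expansion $T(z) = \sum_{n\ge1} n^{n-1}z^n/n!$ of the tree function, uses exact rational arithmetic, and compares the marked generating function with the sum of the three cases coefficient by coefficient. The helper names (`mul`, `gf`, `theta`, `check`) and the sample values $\nu = 4$, $\mu = 6$ are illustrative choices only.

```python
from fractions import Fraction
from math import factorial

N = 8  # keep every power series modulo z^(N+1)

def mul(a, b):
    """Multiply two truncated power series given as coefficient lists."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        for j in range(N + 1 - i):
            c[i + j] += ai * b[j]
    return c

# Tree function T(z) = sum_{n>=1} n^(n-1) z^n / n! (assumed standard expansion).
T = [Fraction(0)] + [Fraction(n ** (n - 1), factorial(n)) for n in range(1, N + 1)]

def gf(nu, a):
    """Series of T(z)^nu / (1 - T(z))^a for integer nu >= 0 and rational a."""
    out = [Fraction(0)] * (N + 1)
    coeff = Fraction(1)                          # binom(a + k - 1, k), starting at k = 0
    t_pow = [Fraction(1)] + [Fraction(0)] * N    # T(z)^k
    for k in range(N + 1):
        out = [o + coeff * t for o, t in zip(out, t_pow)]
        coeff = coeff * (a + k) / (k + 1)
        t_pow = mul(t_pow, T)
    for _ in range(nu):
        out = mul(out, T)
    return out

def theta(series):
    """Marking operator: multiply the coefficient of z^n by n."""
    return [n * c for n, c in enumerate(series)]

def check(nu, mu):
    """Compare theta applied to (20.1) with Case 1 + Case 2 + Case 3."""
    half = Fraction(1, 2)
    lhs = theta(gf(nu, mu + half))
    case1 = [nu * c for c in gf(nu, mu + 3 * half)]
    case2 = [mu * c for c in gf(nu + 1, mu + 5 * half)]
    case3 = [half * c for c in gf(nu + 1, mu + 5 * half)]
    rhs = [x + y + w for x, y, w in zip(case1, case2, case3)]
    return lhs == rhs

print(check(4, 6))  # nu = 4, mu = 6 are illustrative values only
```

Because the arithmetic is exact, agreement through $z^N$ is a genuine identity check on those coefficients rather than a floating-point coincidence.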

The next step, choosing y, is the same, except that now we mark a vertex of M' and 
obtain M''. The transition from M' to M'' again leads to three cases; we attach another 
half-edge and possibly split an existing edge or add a new self-loop. In particular, we might 
split the half-edge of M'. The change in the generating function is once again represented 
by (20.5), but this time $\nu$ and $\mu$ have to be adjusted to equal the number of vertices and 
edges of M'. The left term of (20.5) therefore becomes 

$$\frac{\nu^2\,T(z)^{\nu}}{(1-T(z))^{\mu+5/2}} + \frac{\nu(\mu+\frac32)\,T(z)^{\nu+1}}{(1-T(z))^{\mu+7/2}}\,, \tag{20.6}$$

and the right term becomes 

$$\frac{(\mu+\frac12)(\nu+1)\,T(z)^{\nu+1}}{(1-T(z))^{\mu+7/2}} + \frac{(\mu+\frac12)(\mu+\frac52)\,T(z)^{\nu+2}}{(1-T(z))^{\mu+9/2}}\,. \tag{20.7}$$

Notice that the first term of (20.5) corresponds to the case that the deficiency increases 
by 1 when x is chosen, while the second term corresponds to the case where the deficiency 
stays the same. Similarly, the first terms of (20.6) and (20.7) correspond to an increase in 
deficiency when y is chosen, after x has already been marked. 

By looking at the coefficients of these generating functions we can see why the 
deficiency rarely increases unless the total number of vertices in the cyclic part is not much 
larger than $\nu$. Suppose we change the generating function to 

{l-sT{z)f^ ^ 

then 

[z^]F{z,s)\s=i 

will be the average number of tree-root vertices that appear within the edges of M. For 
fixed $\nu$ and $\mu$ as $n \to \infty$ this number is 

lz-](^+^)nzr+^l-Tiz))-^'-"' fa+l)f„(;.+ |) , 



which is approximately $\sqrt{\mu n}$ by (3.9), when $\mu$ is large. Thus, there are about $\sqrt{\mu n}$ tree 
roots, only $\nu$ of which will increase the deficiency when chosen; almost all choices of x 
and y will fall in trees that add new vertices to M and M'. 

If we replace one of the factors $T(z)$ in the numerator of the generating function by 
$\vartheta T(z) = T(z)/(1 - T(z))$, we multiply the coefficient of $z^n$ by the average size of a rooted 
tree; we find that each rooted tree contains about $\sqrt{n/\mu}$ vertices. 

The number n in these calculations has been the number of vertices in the cyclic part 
of a multigraph, and the number $\mu$ is 3r. Let's return to our other notational convention, 
where n is the total number of vertices in the evolving multigraph and $m = \frac n2(1 + \mu n^{-1/3})$ 
is the total number of edges. Recall that the average excess r grows as $\frac23\mu^3$, for $\mu < n^{1/12}$; 
the size of the cyclic part, similarly, has order $\mu n^{2/3}$. The probability that a random 
new edge falls in the cyclic part (and therefore increases the excess) is therefore of order 
$(\mu n^{2/3}/n)^2 = \mu^2 n^{-2/3}$; we must add about $n^{2/3}/\mu^2$ more edges before the excess increases. 

And when it does, the probability of choosing a "bad" x or y, making the new multigraph 
unclean, is the ratio of 2r to the total number of tree roots, which is of order 

$$\sqrt{3r(\mu n^{2/3})} \approx \sqrt2\,\mu^2 n^{1/3}.$$

We will probably have to do $n^{1/3}/\mu$ augmentations of excess, adding $(n^{2/3}/\mu^2)(n^{1/3}/\mu) = n/\mu^3$ 
more edges, before we reach an unclean multigraph. That is why the multigraph 
tends to stay clean until $\mu = n^{1/12}$, as asserted in Theorem 7. 

After x and y are chosen to form the endpoints of a new edge, a third step takes 
place: This new edge is merged or integrated with the other edges. Symbolically, the two 
half-edges for x and y are now spliced together. We can complete our study of how the 
generating function changes at the time of excess augmentation by considering this third 
and final step. 

It is easiest to consider the inverse of the final step, namely the operation of marking 
an edge whose removal would decrease the excess. Such an edge must be in the complex 
part, not the acyclic or unicyclic part. The operator that corresponds to marking an 
arbitrary edge in a complex multigraph of excess r is $r + \vartheta$, because this multiplies the 
coefficient of $z^n$ by $r + n$, the total number of edges. However, we also need to figure out the 
generating function for "insignificant" edges, edges whose removal would leave the excess 
unchanged. Such edges can be described by an ordered pair consisting of a rooted tree 
and a multigraph of excess r with a marked vertex; one end of the edge is attached to the 
marked vertex and the other end is attached to the root of the tree. Thus the appropriate 
operator for insignificant edges is $T(z)\vartheta$. Altogether we find that the operator 
that corresponds to marking a significant edge, given a family of complex multigraphs 
of excess r, is $r + \vartheta - T\vartheta$. We also should multiply this by two, because we assign an 
orientation to the edge with the ordered pair (x, y). 

When the operator $2(r + \vartheta - T(z)\vartheta)$ is applied to a generating function of the form 
$T(z)^{\nu}/(1 - T(z))^{\mu}$, with $\mu = \nu + r$, we get 

$$2\bigl(r + (1 - T(z))\vartheta\bigr)\,\frac{T(z)^{\nu}}{(1-T(z))^{\mu}} = \frac{2\mu\,T(z)^{\nu}}{(1-T(z))^{\mu+1}}\,. \tag{20.9}$$



Therefore the inverse operation we seek, which merges an ordered (x, y) into the set of 
existing edges, takes 

$$\frac{T(z)^{\nu}}{(1-T(z))^{\mu+1}} \;\longmapsto\; \frac{1}{2\mu}\,\frac{T(z)^{\nu}}{(1-T(z))^{\mu}}\,. \tag{20.10}$$

For example, the first term of (20.6) will go into 

$$\frac{\nu^2\,T(z)^{\nu}}{2(\mu+1)\,(1-T(z))^{\mu+3/2}}\,.$$

(First we multiply by $(1 - T(z))^{1/2}$ to get rid of the unicyclic components, then we apply 
the inverse operation (20.10), then we put the unicyclic components back.) 

Altogether we find that the generating function $T(z)^{2r-d}/(1 - T(z))^{3r-d+1/2}$ for cyclic 
multigraphs of excess r and deficiency d makes the following contributions to the generating 
functions for cyclic multigraphs of excess r + 1 and deficiencies d, d + 1, and d + 2, according 
to (20.6), (20.7), and (20.10): 

$$\frac{(6r-2d+5)(6r-2d+1)}{8(3r-d+3)}\,\frac{T(z)^{2r+2-d}}{(1-T(z))^{3r+3-d+1/2}}
+ \frac{(2r-d)(6r-2d+3) + (2r-d+1)(6r-2d+1)}{4(3r-d+2)}\,\frac{T(z)^{2r+1-d}}{(1-T(z))^{3r+2-d+1/2}}
+ \frac{(2r-d)^2}{2(3r-d+1)}\,\frac{T(z)^{2r-d}}{(1-T(z))^{3r+1-d+1/2}}\,. \tag{20.11}$$

This is essentially the same as the recurrence relation for $e_{rd}$ in (5.11)-(5.13). 
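
The three weights in (20.11) can be recomputed from the two marking steps and the merge step. The sketch below is illustrative only: it takes $\nu = 2r - d$ and $\mu = 3r - d$, reads the numerators off (20.6) and (20.7), divides by twice the number of edges of the new reduced multigraph as in (20.10), and compares the result with the closed forms. The function names are hypothetical.

```python
from fractions import Fraction

def transitions_from_steps(r, d):
    """Weights obtained from (20.6), (20.7) and the merge rule (20.10)."""
    nu, mu = 2 * r - d, 3 * r - d
    half = Fraction(1, 2)
    both_bad = Fraction(nu * nu)                               # first term of (20.6)
    one_bad = nu * (mu + 3 * half) + (mu + half) * (nu + 1)    # 2nd of (20.6) + 1st of (20.7)
    none_bad = (mu + half) * (mu + 5 * half)                   # second term of (20.7)
    # Merging divides by twice the edge count of the new reduced multigraph.
    return {
        d:     none_bad / (2 * (mu + 3)),
        d + 1: one_bad / (2 * (mu + 2)),
        d + 2: both_bad / (2 * (mu + 1)),
    }

def transitions_closed_form(r, d):
    """The same weights written as in (20.11)."""
    return {
        d:     Fraction((6*r - 2*d + 5) * (6*r - 2*d + 1), 8 * (3*r - d + 3)),
        d + 1: Fraction((2*r - d) * (6*r - 2*d + 3) + (2*r - d + 1) * (6*r - 2*d + 1),
                        4 * (3*r - d + 2)),
        d + 2: Fraction((2*r - d) ** 2, 2 * (3*r - d + 1)),
    }

for r in range(0, 4):
    for d in range(0, 2 * r + 1):
        assert transitions_from_steps(r, d) == transitions_closed_form(r, d)

print(transitions_closed_form(1, 0))
```

Keeping the step-by-step version next to the closed form makes it easy to see which factor comes from which choice of x and y.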

We can illustrate the observations of this section by introducing another partial 
ordering analogous to Figure 1. Every evolving graph or multigraph traces a path in Figure 2, 
just as it does in Figure 1; but in Figure 2 the state (r, d) represents excess r and 
deficiency d. Fractions in brackets above each state are the coefficients $e_{rd}$ of the generating 
function (5.10). Fractions on the arrows are not transition probabilities but rather the 
amounts by which each generating function coefficient affects the coefficients at the next 
level; these fractions are the coefficients in (20.11). 




Figure 2. The evolution of deficiency. Each configuration (r, d) stands 
for a graph or multigraph whose complex part reduces to a multigraph 
with 2r − d vertices and 3r − d edges, when vertices of degrees 1 and 2 are 
eliminated. A graph or multigraph with deficiency 0 is called "clean"; 
the reduced multigraphs in such cases are 3-regular. When r is small, 
each unit increase in deficiency occurs with probability of order $(r/n)^{1/3}$; 
therefore most random graphs stay clean until r is quite large. 

21. Waiting for uncleanness. We have seen that a graph almost surely stays clean 
while it has $\frac12(n + \mu n^{2/3})$ edges, as long as $\mu$ is $o(n^{1/12})$. What happens when $\mu$ gets 
a bit larger? Another contour integral provides the answer; in this one, we rescale $\mu$ in 
preparation for the appearance of the giant component, but we allow $\mu$ to be small enough 
that there is a substantial overlap with the estimate (10.1) of Lemma 3. 

Lemma 7. If $m = \frac12(n + \mu n)$ and $r = \frac23\mu^3 n + \rho\sqrt{\mu^3 n}$, we have 

= 5(y,/.,p,n)exp(0((l + \pf)fi-'/'n-'/' + (1 + \p\)^^'/'n'/')) , (21.1) 
where 



S(y,/x,p,n) = -y/^//-i-^exp('-^/n-^p') , (21-2) 



uniformly for n logn < /i < n , \p\ < and fixed y as n — > oo. 

Proof This is the sort of lemma for which computer algebra really pays off. We can begin 
by using Stirling's approximation to show that 

lo ( 2"^m!n!e^ ^ - -n 3rln % 

-|ln//+iln| + f/i'^n + fp^ 

+ 0((1 + |pp)//-3/2n-i/2 + (1 + |p|)//^/V/2) . (21.3) 

Now we express the remaining factor by using the trick of (10.11): 

(2^(.))— +-T(.)^- - J- /(I - .)-^e^« ^ (21 4) 

^ J (1-T(.))^^+^ -2.if^' ' . ' ^^'-"^ 

where 

$$g(z) = nz + (3r - m)\ln z - 3r\ln(1 - z) + (n - m + r)\ln(2 - z)\,. \tag{21.5}$$

As before we can show that the asymptotic value of the integral depends only on the 
behavior of the integrand near z = 1. This time we need not worry about a three-legged 
saddle point, because we are sufficiently far from the critical region near $\mu = 0$. A good path 
of integration turns out to be 2 = l — a + itfi~^^'^n~^^'^ , where a = |//^-|-|p//~^/^n~''^/^. 
Indeed, some beautiful cancellation occurs in the most significant terms: 

+ 0{{{l + \p\)„-'/'n-'/'+„)f), (21.6) 

when $|t| \le \log n$. The O bounds follow from the fact that the power series for $\ln z$, 
$\ln(1 - z)$, and $\ln(2 - z)$ converge in the stated ranges. 
The other factors of the integrand, besides $e^{g(z)}$, are 

where (3 = {a — %tpr^l'^n~^l'^)l p, = 1 — |// + (|p — it)p~^/'^n~^/'^. We can now write the 
integral as a factor independent of t times 

J e-5*'/2(i + + 72^2 + ---)dt, (21.7) 
where the $\gamma$'s are functions of $\mu$ and $\rho$, and the series is convergent for $|t| \le \log n$. The 
integrand is superpolynomially small when $t = \pm\log n$; hence we can bound the error 
terms for $|t| \le \log n$, then integrate from $-\infty$ to $\infty$, showing that (21.7) is 

^ (1 + 0(p^/ + (1 + p2)^-3/2^-i/2)) . (21.8) 

Finally we observe that the other factors nicely cancel the leading terms of (21.3); only 
(21.1) and (21.2) are left. The overall formula (21.1) has a weaker estimate than (21.8) 
because Stirling's approximation (21.3) is more sensitive to the value of p and because of 
the term ^(1 — a). □ 

Notice that Lemma 7 matches the first estimate of Lemma 5, which says that the 
asymptotic probability of excess r is like that for a normal distribution with mean $\frac23\mu^3 n$ 
and variance of order $\mu^3 n$, as long as $r = O(\mu^3 n)$. On the other hand, the extreme tails for 
larger values of r are not as small as they would be in a normal distribution; they decrease 
only as shown in the second estimate of Lemma 5. For example, with probability $100^{-m}$ 
all edges will join vertices in the first n/10 vertices; so there will be at least 0.9n isolated 
vertices, and the excess will be at least m − n + 0.9n ≥ 0.4n. 

Theorem 12. The probability that a random multigraph with n vertices and $m = \frac12(n + \mu n)$ 
edges is clean, when $0 < \mu < n^{-1/5}$, is 

$$\exp\bigl(-\tfrac23\mu^4 n + O(\mu^{5/2}n^{1/2}\log n + \mu^{-3/2}n^{-1/2}(\log n)^3)\bigr)\,. \tag{21.9}$$

Proof. The probability decreases as $\mu$ increases. Therefore we need to verify the result 
only for $\mu$ greater than $n^{-3/11}$ or so, when the error estimate $\mu^{-3/2}n^{-1/2}(\log n)^3$ does not 
swamp the main term $\exp(-\frac23\mu^4 n) - 1 = -\frac23\mu^4 n + O(\mu^8 n^2)$. 

Formula (21.1) is the probability that a random graph or multigraph with m edges is 
clean and has excess r, if we set $y = \frac12$. That probability is superpolynomially small unless 
$|\rho| \le \log n$, because of the negative $\rho^2$ term in the exponent. Extremely large values of $\rho$, not 
covered by the hypotheses of Lemma 7, are also negligible. Therefore we can sum over r by 
integrating over $\rho$ from $-\log n$ to $+\log n$; and we can then extend the integral from $-\infty$ 
to $\infty$ without changing its asymptotic value. Hence the probability of cleanliness is 

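$$\sqrt{\frac{3}{20\pi}}\;\mu^{-3/2}n^{-1/2}\exp\bigl(-\tfrac23\mu^4 n\bigr)\int_{-\infty}^{\infty} e^{-3\rho^2/20}\,\mu^{3/2}n^{1/2}\,d\rho = \exp\bigl(-\tfrac23\mu^4 n\bigr)\,,$$
plus the error term. Another nice bit of cancellation. □ 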

Corollary. The average number of edges added to an evolving multigraph until it first 
becomes unclean is 

$$\tfrac12 n + 2^{-5/4}3^{1/4}\,\Gamma(\tfrac54)\,n^{3/4} + O(n^{8/11+\epsilon})\,, \tag{21.10}$$

and the standard deviation is of order $n^{3/4}$. 
Proof. The stated average number is $\sum_{m\ge0} p_m$, where $p_m$ is the probability in the theorem. 
When $\mu < 0$, the probability of uncleanliness is $O(n^{-1/3})$ by Theorem 7, so the sum 
for $0 \le m < \frac12 n$ is $\frac12 n - O(n^{2/3})$. When $0 < \mu < n^{-3/11}(\log n)^{6/11}$, the probability 
of uncleanliness is $O(n^{-1/11}(\log n)^{24/11})$ by (21.9); after that the error is negligible in 
comparison with the integral 

$$\int_0^\infty \exp\bigl(-\tfrac23\mu^4 n\bigr)\,\frac n2\,d\mu = \frac{n^{3/4}}{2}\int_0^\infty e^{-(2/3)u^4}\,du = c\,n^{3/4}\,,$$

where c is the coefficient of $n^{3/4}$ in (21.10). This proves (21.10). 

The expected value of $m^2$ at the stopping time is $\sum_{m\ge0}(2m + 1)p_m$, and we need to 
be especially careful when evaluating this sum; the simple estimate $p_m = 1 - O(n^{-1/3})$ 
for $m < \frac12 n$ will not do, because it will obliterate significant terms by adding $O(n^{5/3})$. 
Appropriate accuracy is maintained by computing the expected value of $(m - \frac12 n)^2$, which 
is 

$$\frac{n^2}{4} + \sum_{m\ge0}(2m + 1 - n)\,p_m = \sum_{m=0}^{n/2}(n - 2m)(1 - p_m) + \sum_{m=n/2}^{\infty}(2m - n)\,p_m + O(n)\,.$$

We can show that the terms for $m < \frac12 n$ are now negligible, because the cleanliness 
probability $p_m$ is bounded below by the probability that a multigraph with m edges has 
excess 0. Therefore $1 - p_m = O(n^2/(n - 2m)^3)$ when $m \le m_0 = \frac12 n - n^{2/3+\epsilon}$, by the 
remarks preceding (13.23); and 

$$\sum_{m=0}^{n/2}(n - 2m)(1 - p_m) = \sum_{m=0}^{m_0} O\Bigl(\frac{n^2}{(n-2m)^2}\Bigr) + \sum_{m=m_0}^{n/2} O(n - 2m) = O(n^{4/3-\epsilon}) + O(n^{4/3+2\epsilon})\,.$$

The other terms can be approximated by 

$$\sum_{m=n/2}^{\infty}(2m - n)\,p_m = \int_0^\infty \frac{\mu n^2}{2}\,e^{-(2/3)\mu^4 n}\,d\mu + O(n^{16/11+\epsilon})\,,$$

with an error estimate coming from the range $0 < \mu < n^{-3/11+\epsilon}$ as before. It follows that 
the variance is asymptotic to this integral minus the square of ((21.10) − $\frac12 n$), namely 
$(3^{1/2}\,\Gamma(\frac12)\,2^{-7/2} - c^2)\,n^{3/2}$. 

Incidentally, the value of c is approximately 0.50155, and the standard deviation is 
approximately $0.1407\,n^{3/4}$. □ 
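
The two numerical constants just quoted can be reproduced from elementary integrals. The sketch below is a sanity check under a stated assumption: it takes the cleanliness probability to be $\exp(-\frac23\mu^4 n)$, which is the reading of (21.9) that is consistent with the quoted value of c, and it uses the standard formulas $\int_0^\infty e^{-ax^4}dx = \Gamma(\frac54)a^{-1/4}$ and $\int_0^\infty x\,e^{-ax^4}dx = \sqrt\pi/(4\sqrt a)$. The variable names are illustrative.

```python
from math import gamma, pi, sqrt

a = 2 / 3  # assumed coefficient of mu^4 n in the exponent of (21.9)

# Mean: (n/2) * integral_0^inf exp(-a mu^4 n) d mu = c * n^(3/4).
c = 0.5 * gamma(1.25) * a ** (-0.25)

# Second moment about n/2: (n^2/2) * integral_0^inf mu exp(-a mu^4 n) d mu = s2 * n^(3/2).
s2 = 0.5 * sqrt(pi) / (4 * sqrt(a))

# Standard deviation coefficient: sqrt(s2 - c^2), multiplying n^(3/4).
std = sqrt(s2 - c * c)

print(round(c, 5), round(std, 4))
```

The value of `s2` equals $3^{1/2}\Gamma(\frac12)2^{-7/2}$, so the last line mirrors the variance expression in the proof.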

Once a graph begins to get dirty, its deficiency rises rapidly. For fixed d we can 
estimate the probability of excess r and deficiency d by taking $y = \frac12 - d$ and multiplying 
(21.1) by $r^d/d!$, because of (7.16). The fact that (21.1) has $T(z)^{2r}$ in the numerator 
instead of $T(z)^{2r-d}$ is unimportant, since $T(z)^{2r} = T(z)^{2r-d}\sum_k\binom dk (T(z) - 1)^k$. We obtain 
a probability about $(\frac23\mu^4 n)^d/d!$ times as large as before, but this is damped rapidly by 
the factor $\exp(-\frac23\mu^4 n)$ when $\mu$ becomes greater than $n^{-1/4}$. We will look further at the 
growth of deficiency in section 23. 
22. A closer look. The structure theory of section 20 gives us more detailed information 
about what happens when an evolving multigraph first changes from clean to unclean. We 
learned in that section that the process of adding a new edge (x, y) can be broken into 
three parts, namely the introduction of half-edges at x and y followed by the joining of 
those two edges. The deficiency can increase by 1 during each of the first two stages. 

The probability that a clean graph becomes potentially deficient when a half-edge 
is attached to x is the probability that the image M' of the half-edge after pruning and 
cancellation does not create a new vertex not in M. According to the analysis of section 20, 
the expected number of times this happens is 

P^i^) = E ^^27^ t"''"^'^] ^) ' (22-1) 



r>0 (1 - T{WZ)) 

= ^w^z^ + ^ {2w^ + 13w^) z^ + --- . (22.2) 

The factor 2r covers the deficient choices of x, as in the first term of (20.5). 

Actually (22.2) is an overestimate, because some apparently bad choices of x are "false 
alarms." If the half-edge of x does not add a vertex to M, there's still a possibility that 
y will be chosen in the acyclic part; then the new edge (x, y) will not increase the excess 
and the multigraph will still be clean. The expected number of false alarms is 

;(«) = E^KV1^G.K.). (22.3) 



m 

The multigraph becomes unclean when y is chosen if the half-edge for y prunes and 
cancels to a reduced multigraph M'' having the same 2r + 1 vertices as M'. This occurs 
with probability 

m 

r>0 (1 - T{WZ)) 

= ^wz+\{2w + 9w^)z^ + ■■■ . (22.5) 
Consequently we must have 

$$p_1(n) - p_1'(n) + p_2(n) = 1 \tag{22.6}$$

for all n; this identity is a nontrivial property of the bivariate generating functions $G_1(w, z)$ 
and $G_2(w, z)$. When n = 6, for example, computer calculations show that 

$$p_1(6) = \frac{10288260775}{22039921152} \approx 0.4668\,; \qquad p_1'(6) = \frac{38865625}{612220032} \approx 0.0635\,;$$

$$p_2(6) = \frac{13150822877}{22039921152} \approx 0.5967\,.$$
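
The identity (22.6) can be confirmed exactly for the n = 6 values quoted above. This short check is only an arithmetic verification of the three printed fractions, with illustrative variable names; it does not recompute them from the generating functions.

```python
from fractions import Fraction

p1 = Fraction(10288260775, 22039921152)      # expected number of "bad x" events, n = 6
p1_false = Fraction(38865625, 612220032)     # expected number of false alarms, n = 6
p2 = Fraction(13150822877, 22039921152)      # uncleanness first caused by y, n = 6

total = p1 - p1_false + p2
print(total)
print(round(float(p1), 4), round(float(p1_false), 4), round(float(p2), 4))
```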

We can use Lemma 7 to calculate the approximate values of these quantities when n 
is large, ignoring extreme terms not covered by that lemma: 

pi(n)--/ / 2rS(|,//,p,n)(//3/V/2dp)(ind/x) 

^2-7/43-1/4^(3)^1/4. (22.7) 

P» = E ^"^"'^r2^^^'''' [^"-"] T{wz)G,{w, z) 

I b 

m 

2 /'OO poo 

~ - / m-M 2rS(|,/x,p,n)(/x3/V/2dp)(ind/x) 

^ JO ^-oo 

^2-7/43-1/4^(3)^1/4. (22.8) 

-1 poo poo 

P2{n)^- / / {3r+l){2r+l)B{l,^^,p,n)i^^^/'n'/^dp){lnd^^) 

n Jn-^l^ J-oo 

^ (22.9) 



2 

Notice that $p_1(n)$ and $p_1'(n)$ are unbounded, so they must be regarded as expected values 
(not probabilities). But $p_1(n) - p_1'(n)$ is the probability of a "true alarm." As we might 
have guessed, the transition from clean to unclean occurs about half the time when x is 
chosen, half the time when y is chosen. 

23. Giant growth. We know from the classical theory [13] that a giant component will 
emerge when the number of edges is $\frac n2(1 + \mu)$ for a positive constant $\mu$. The classical theory 
deals with graphs, but the same phenomenon will occur with multigraphs, because random 
graphs are generated by the multigraph process if we discard self-loops and duplicate edges; 
discarded edges do not affect the size of components, and comparatively few edges are 
discarded until the graph has gotten rather dense (see [4]). 

Instead of relying on the classical theory, we can also deduce the existence of a giant 
component by studying the generating function G{w,z). The proof is indirect: First we 
count the vertices that lie in trees and unicyclic components, showing that there probably 
aren't too many of those. Then we show that it is improbable to have two distinct complex 
components. 

The first part is easy, because there is a simple closed form for the expected number 
of vertices in trees. If we mark just the vertices in trees of size k, by differentiating the 
generating function 

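$$G(w, z)\exp\bigl(-k^{k-2}w^{k-1}z^k/k! + k^{k-2}w^{k-1}z^k s^k/k!\bigr)$$

with respect to s and setting s = 1, we see that the expected number of such vertices is 
just 

$$\frac{2^m m!\,n!}{n^{2m}}\,[w^m z^n]\,\frac{k^{k-1}}{k!}\,w^{k-1}z^k\,G(w, z) = \frac{2^m m!\,n!}{n^{2m}}\,\frac{k^{k-1}}{k!}\,[w^{m-k+1}z^{n-k}]\,G(w, z) = \frac{2^m m!\,n!}{n^{2m}}\,\frac{k^{k-1}}{k!}\,\frac{(n-k)^{2m-2k+2}}{2^{m-k+1}(m-k+1)!\,(n-k)!}\,;$$

this can be written 

$$\frac{k^{k-1}}{k!}\,\frac{n^{\underline{k}}\,2^{k-1}m^{\underline{k-1}}}{(n-k)^{2k-2}}\,\Bigl(\frac{n-k}{n}\Bigr)^{2m} \tag{23.1}$$

in terms of falling factorial powers $x^{\underline{k}} = x(x - 1)\ldots(x - k + 1)$. 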

Asymptotically, we have $n^{\underline{k}} = n^k(1 + O(k^2/n))$ and $(n - k)^k = n^k(1 + O(k^2/n))$ for 
all k; also $(1 - k/n)^n = e^{-k}(1 + O(k^2/n))$ for $k \le \sqrt n$ and $(1 - k/n)^n < e^{-k}$ for $k < n$. If $\mu$ 
is a nonzero constant, $\mu > -1$, and if $m = \frac n2(1 + \mu)$, expression (23.1) is 

$$\frac{n}{1+\mu}\,\frac{k^{k-1}}{k!}\,(1 + \mu)^k e^{-k(1+\mu)}\,\Bigl(1 + O\Bigl(\frac{k^2}{n}\Bigr)\Bigr) \tag{23.2}$$

for k < i/n; and it is superpolynomially small when k = ^/n, because it is 0(((1 + 
li)e~'^)''k^/'^) and (1 + iJi)e~^ < 1. It is also superpolynomially small when k > y^, 
because we will prove in section 27 below that a continuous approximation of the quantity 

Im — k In — k 2^ m-n- fn — k^^"^ 
m V n (n — k)"^^ \ n 

decreases when k increases. 

Let $\sigma$ be defined by the formula 

$$(1 + \mu)e^{-\mu} = (1 - \sigma)e^{\sigma}\,, \qquad \sigma = \mu + O(\mu^2)\,. \tag{23.4}$$

Then $\sigma$ is the quantity called $1 - x(\frac12(1 + \mu))$ in [13], and we have 

$$\sum_{k\ge1}\frac{k^{k-1}}{k!}\bigl((1 + \mu)e^{-(1+\mu)}\bigr)^k = \sum_{k\ge1}\frac{k^{k-1}}{k!}\bigl((1 - \sigma)e^{-(1-\sigma)}\bigr)^k = T\bigl((1 - \sigma)e^{-(1-\sigma)}\bigr) = 1 - \sigma\,,$$

when $\mu$ is positive. By summing (23.2) over all k, we conclude that the expected total 
number of vertices in trees is 

$$\frac{1-\sigma}{1+\mu}\,n + O(\sigma^{-3})\,; \tag{23.5}$$

the error term $O(\sigma^{-3})$ here comes from summing $\vartheta^2 T((1 - \sigma)e^{-(1-\sigma)})$, which brings a 
factor of $k^2$ into each term. 

For example, if $1 + \mu = \ln 4$, we have $1 - \sigma = \ln 2$, because $\frac14\ln 4 = \frac12\ln 2$. When the 
number of edges reaches $n\ln 2$ the expected number of vertices in trees will be $\frac12 n$. And 
in general when the number of edges reaches $\frac{n}{2x}\ln\frac{1}{1-x}$, the expected number of vertices in 
trees will be $(1 - x)n$, for $0 < x < 1$. 
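
The example with $1 + \mu = \ln 4$ is easy to try numerically. The following sketch is an illustration, not part of the analysis: it solves (23.4) for $\sigma$ by bisection, evaluates the leading term $\frac{1-\sigma}{1+\mu}n$ of the expected number of tree vertices, and compares it with one simulated multigraph in the uniform model, where each edge is an ordered pair of vertices chosen with replacement. The function names, the seed, and the choice n = 20000 are arbitrary.

```python
import math
import random

def sigma_of(mu):
    """Solve (1 + mu) e^{-mu} = (1 - sigma) e^{sigma} for sigma in (0, 1), mu > 0."""
    target = math.log1p(mu) - mu           # logarithm of the left-hand side
    lo, hi = 0.0, 1.0 - 1e-15
    for _ in range(200):
        mid = (lo + hi) / 2
        # log(1 - s) + s decreases from 0 as s grows, so move right while above target.
        if math.log1p(-mid) + mid > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def tree_vertices(n, m, rng):
    """Add m random edges (self-loops and repeats allowed); count vertices in tree components."""
    parent = list(range(n))
    size = [1] * n
    edges = [0] * n                        # edges per component, stored at its root

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for _ in range(m):
        a, b = find(rng.randrange(n)), find(rng.randrange(n))
        if a != b:
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a
            size[a] += size[b]
            edges[a] += edges[b]
        edges[a] += 1
    # A component is a tree exactly when it has one fewer edge than vertices.
    return sum(size[v] for v in range(n) if parent[v] == v and edges[v] == size[v] - 1)

n = 20000
m = round(n * math.log(2))                 # number of edges with 1 + mu = ln 4
mu = 2 * m / n - 1
predicted = (1 - sigma_of(mu)) / (1 + mu) * n
observed = tree_vertices(n, m, random.Random(1))
print(predicted, observed)
```

A single run fluctuates around the prediction by an amount of order $\sqrt n$, so only rough agreement should be expected.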

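The expected number of vertices in unicyclic components can be found in a similar 
way, by differentiating 

$$G(w, z)\,e^{-V(wz)+V(wzs)}$$

with respect to s and setting s = 1. The generating function is 

$$\bigl(\vartheta V(wz)\bigr)\,G(w, z) = \frac12\,\frac{T(wz)}{(1 - T(wz))^2}\,G(w, z)\,, \tag{23.6}$$

and we have 

$$\frac{T(z)}{(1 - T(z))^2} = \sum_{k\ge1}\frac{k^k Q(k)}{k!}\,z^k \tag{23.7}$$

by (3.12). The expected number of vertices belonging to unicyclic components of size k 
therefore can be expressed in closed form, analogous to (23.1): 

$$\frac12\,\frac{k^k Q(k)}{k!}\,\frac{2^m m!\,n!}{n^{2m}}\,[w^{m-k}z^{n-k}]\,G(w, z) = \frac12\,\frac{k^k Q(k)}{k!}\,\frac{n^{\underline{k}}\,2^k m^{\underline{k}}}{(n-k)^{2k}}\,\Bigl(\frac{n-k}{n}\Bigr)^{2m}\,. \tag{23.8}$$

Summing over k, and breaking the sum into two parts $k \le \sqrt n$ and $k > \sqrt n$ as above, now 
yields 

$$\frac12\sum_{k\ge1}\frac{k^k Q(k)}{k!}\bigl((1 + \mu)e^{-(1+\mu)}\bigr)^k\Bigl(1 + O\Bigl(\frac{k^2}{n}\Bigr)\Bigr) = \frac{1-\sigma}{2\sigma^2} + O(\sigma^{-6}n^{-1})\,. \tag{23.9}$$

(We will obtain sharper bounds in section 27.) 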

We have assumed in this discussion that $\mu$ is a constant. But our relatively coarse 
asymptotic arguments are in fact valid if $\mu$ varies with n, provided that it is not too small. 
Relation (23.4) defines $\sigma$ as an analytic function of $\mu$, 

$$\sigma = \mu - \frac23\mu^2 + \frac49\mu^3 - \frac{44}{135}\mu^4 + \frac{104}{405}\mu^5 - \frac{40}{189}\mu^6 + \frac{7648}{42525}\mu^7 - \frac{2848}{18225}\mu^8 + \frac{31712}{229635}\mu^9 - \frac{23429344}{189448875}\mu^{10} + \cdots\,, \tag{23.10}$$

where the power series converges for $|\mu| < 1$. The quantity $((1 + \mu)e^{-\mu})^k$ is 
superpolynomially small for $k = \sqrt n$ if $\mu$ is at least, say, $n^{-1/4}\log n$. We are therefore justified in using 
(23.5) + (23.9) as the expected number of vertices in non-complex components whenever 
$\mu \ge n^{-1/4}\log n$. 

Suppose /i = n~^/^logn. Then the expected number of vertices in unicyclic com- 
ponents is approximately ~ = |n^/^(logn)~^, and a similar argiiment proves 
that the expected value of the square of this number is approximately ~ |n(log n] . 
So the probability of choosing two vertices in unicyclic components is approximately 
|n~^(logn)~^. This probability decreases steadily as m increases, but even if it stayed 
fixed we would have to add about |n(logn)'^ more edges before hitting two unicyclic ver- 
tices, i.e., before creating a new bicyclic component. By that time the expected number 
of vertices in trees and unicyclic components will be nearly zero, so the multigraph will 
almost surely contain no such vertices. Therefore, if there is only one complex component 
present when /j, = n~^/^logn, there will almost surely be only one complex component 
from that time on; it will become gigantic. (We will obtain sharper results in section 27; 
see Lemma 9 and its corollary.) 

Let's look more closely at what happens as the giant component develops. According 
to (23.5), it will have approximately 

$$\Bigl(1 - \frac{1-\sigma}{1+\mu}\Bigr)n = \frac{\mu+\sigma}{1+\mu}\,n = 2\mu n + O(\mu^2 n) \tag{23.11}$$

vertices when $m = \frac n2(1 + \mu)$; this is substantially larger than the number $\frac{1}{2\mu^2} + O(\mu^{-1})$ 
of unicyclic vertices. When m increases by 1, the value of $\mu n$ increases by 2, so (23.11) 
increases by 4. Notice that (23.11) agrees with the leading term of (15.11). 

We saw in section 21 that the expected excess r is approximately $\frac23\mu^3 n$ when 
$m = \frac n2(1 + \mu)$, at least for $0 < \mu < n^{-1/5}$. We will prove momentarily that this relationship 
continues to hold as long as $\mu$ remains o(1); but before giving the proof, let's look at 
the situation heuristically. The probability that a new edge increases the excess is the 
probability that both of its endpoints lie in the cyclic part, namely $(2\mu)^2$. The change in r 
with respect to m is $(dr/d\mu)(d\mu/dm) = (2\mu^2 n)(2/n)$, and this too is $(2\mu)^2$. So the relation 
$r = \frac23\mu^3 n$ is consistent with (23.11) when $\mu$ is not too large. 

The expected value of the deficiency d turns out to be approximately $\frac23\mu^4 n$, about 
$\mu$ times r. Heuristic justification comes from the considerations of section 20: When a 
new edge (x, y) falls in the cyclic part, the probability that x is "bad" (in the sense that it 
increases the deficiency) will be the number of reduced vertices 2r − d divided by the square 
root of 3r − d times the size of the complex part (see the remarks following (20.8)). So 
it will be approximately $\frac43\mu^3 n$ divided by $((2\mu^3 n)(2\mu n))^{1/2}$, namely $\frac23\mu$. The same holds 
for y. Hence the expected increase in d, given that r increases, is $\frac43\mu$. And the derivative 
of $\frac23\mu^4 n$ with respect to $\mu$ is indeed $\frac43\mu$ times the derivative of $\frac23\mu^3 n$. 
In order to carry out a rigorous proof as $\mu$ increases from $n^{-1/3}$ to $n^{-1/4}$ to $n^{-1/5}$ and 
so on, we need to track the full asymptotic spectrum of the behavior of r and d, not using 
just the leading terms. It turns out that r and d are approximately given by the following 
functions of $\mu$ and $\sigma$, whose asymptotic series can be computed from (23.10): 

$$r_\mu = \frac{\mu^2 - \sigma^2}{2(1+\mu)}\,n\,; \tag{23.12}$$

$$d_\mu = \frac{(\mu+\sigma)\bigl(3(\mu-\sigma) - \sigma(\mu+\sigma)\bigr)}{2(1+\mu)}\,n\,. \tag{23.13}$$

Notice that the numerators of both $r_\mu$ and $d_\mu$ are divisible by $(\mu + \sigma)n$, so $r_\mu$ and $d_\mu$ are 
multiples of the formula $(\mu + \sigma)n/(1 + \mu)$ for giant component size (23.11). The quantity 
$\mu + \sigma$ can also, incidentally, be expressed as $\ln(1 + \mu) - \ln(1 - \sigma)$. 
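
The leading behavior of these two functions can be examined numerically. The sketch below rests on an assumption that should be kept in mind: it takes $r_\mu = (\mu^2 - \sigma^2)n/(2(1+\mu))$ and $3r_\mu - d_\mu = \sigma(\mu+\sigma)^2 n/(2(1+\mu))$, the forms implied by the simplifications for $n - m + r$ and $3r - d$ quoted in the proof of Theorem 13, and it compares them with the heuristic values $\frac23\mu^3 n$ and $\frac23\mu^4 n$. All function names are illustrative.

```python
import math

def sigma_of(mu):
    """Solve (1 + mu) e^{-mu} = (1 - sigma) e^{sigma} for sigma in (0, 1) by bisection."""
    target = math.log1p(mu) - mu
    lo, hi = 0.0, 1.0 - 1e-15
    for _ in range(200):
        mid = (lo + hi) / 2
        if math.log1p(-mid) + mid > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def r_over_n(mu):
    """Assumed form of r_mu / n."""
    s = sigma_of(mu)
    return (mu * mu - s * s) / (2 * (1 + mu))

def d_over_n(mu):
    """Assumed form of d_mu / n, namely 3 r_mu / n minus sigma (mu + sigma)^2 / (2 (1 + mu))."""
    s = sigma_of(mu)
    return (mu + s) * (3 * (mu - s) - s * (mu + s)) / (2 * (1 + mu))

for mu in (0.2, 0.05, 0.01):
    ratio_r = r_over_n(mu) / (2 / 3 * mu ** 3)
    ratio_d = d_over_n(mu) / (2 / 3 * mu ** 4)
    print(mu, round(ratio_r, 4), round(ratio_d, 4))
```

Both ratios tend to 1 as $\mu$ decreases, with corrections of relative order $\mu$.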

These values and also have a surprising relation to the confluent hypergeometric 
series F{z) = F{1; 4; Az) of (7.5). It is not difficult to check that 

nif^ + ^y^) - JiTTW [~^^) ' (i-^)(/^ + ^)^ [~^^) ' ^ ^ ^ 



^F((/. + a)/4) _ d, 



(23.15) 



The quantities $r_\mu$ and $d_\mu$ are not the exact expected values of r and d. Indeed, the 
exact expected values are rational numbers, when m and n are integers, while $\sigma$ is always 
irrational when $\mu$ is rational. But we will prove that the distributions of r and d are 
approximately normal with expectations $r_\mu$ and $d_\mu$. 

Before we can prove such a claim, we need to improve the estimate of $e_{rd}$ in (7.16), 
because that estimate was derived only for fixed d. 

Lemma 8. Let F(z) be the function defined in (7.5). If $r \to \infty$ and if d varies in such a 
way that $d/r \to 0$, the polynomial $P_d(r) = [z^d]\,F(z)^{2r-d}$ satisfies 

P,„.^(^(..o(^)). P3.:a) 

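where s is the solution to $\vartheta F(s)/F(s) = d/(2r - d)$. 
Proof. We have 

$$P_d(r) = \frac{1}{2\pi i}\oint\frac{F(z)^{2r-d}\,dz}{z^{d+1}} = \frac{1}{2\pi i}\oint e^{(2r-d)f(z)}\,\frac{dz}{z}\,,$$

where $f(z) = \ln F(z) - (d/(2r - d))\ln z$, integrated on the circle $|z| = s$. By hypothesis, 
$\vartheta f(s) = 0$. Using the expansion formula 

$$f(se^t) = \sum_{j=0}^{n}\frac{t^j}{j!}\,\vartheta^j f(s) + \frac{t^{n+1}}{n!}\int_0^1 (1 - x)^n\,\vartheta^{n+1}f(se^{tx})\,dx \tag{23.17}$$

with n = 2 and $t = i\theta$, we obtain 

$$f(se^{i\theta}) = f(s) - \tfrac12\theta^2\,\vartheta^2 f(s) + O(\theta^3 s)$$

because $|\vartheta^3 f(se^{i\theta})| = O(s)$. If $d \to \infty$, the contour integral is 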

exp((2r - d){f{s) - \e'^d^f{s) + 0{9^s))) dd 



1 

2^ 



TT 



-1 f-nVd 

= y= / exp{{2r-d)f{s)-f/2 + 0{t^d/r) + 0{t^d-^/^))dt 

= (1 + 0{d/r) + 0{d-'/')) , (23.18) 

because ^^f{s) =s + 0{s^) = d/{2r - d) + Oid^r^). The terms 0{t^d/r) and 0{t^d-^/^) 
can safely be moved out of the exponent because they are bounded when \t\ < d^^^ and 
1^1 < yj^jd. Larger values of \t\ are unimportant in the integral because of the factor 
e~*^/^, and because the relation 



F{z) = Z ril-ufe^^'^du 
Jo 



implies that \F{z)\ < F{^z); once the real part is sufficiently small, we can neglect the 
remaining part of the path. 

Equation (23.18) does not match (23.16) perfectly, although it would be sufficient for 
the applications considered below. To derive the sharper estimate claimed in (23.16) when 
d is small, we can apply (23.17) to f{z) — z instead of to f{z), obtaining 

f{se'^) - se'^ = fis) -s-ies + 0(9^'^) ; 

f{se'^) = f{s) + s{e'^ -ie-l)+ 0{e^s^) 



2r — d \ r"^ 
The contour integral without the O term can be evaluated exactly, 



— j exp{{2r-d){f{s) + {e'^ -id-l)d/{2r-d)))d9 



\2r-d 



The O term contributes a relative error of 0{d/r), because we have 

/ \exp{{e'^ - 1 - ie)d)\e'' de ^ f e^^°^^-^^'^e'^ de 

J —TT J —ir 

< r e-'''"''eUe = o{d-^/^), 

J —IT 

where c = 2/7r^. □ 

Theorem 13. The joint distribution of the excess r and deficiency d of a random 
multigraph with $m = \frac n2(1 + \mu)$ edges is approximately normal about the expected values $r_\mu$ 
and $d_\mu$ in (23.12) and (23.13), with zero covariance. More precisely, there exists $\epsilon > 0$ such 
that if 

$$r = r_\mu + \rho\sqrt{\mu^3 n}\,, \qquad d = d_\mu + \delta\sqrt{\mu^4 n}\,, \tag{23.19}$$

the probability that a random multigraph has excess r and deficiency d is 

(23.20) 

when n~^/^ < A* ^ ^ n — > oo, uniformly for \p\ < | \fr\4^ and \8\ < | ^Jn^x^. 

Proof. Before proving formula (23.20), we can verify that its leading factor yields total 
probability 1 when integrated over all values of r and d near $r_\mu$ and $d_\mu$: The integral 
over d gives a factor of $\sqrt{4\pi\mu^4 n/3}$, and the integral over r gives a factor of $\sqrt{20\pi\mu^3 n/3}$. 
Let r and d be given by (23.19); the probability of excess r and deficiency d is then 

2-m! n! e,, (2 - r(.))--+-r(.)"+3-'^ 

J : _, ...-^.-^^1/^ • (23-21) 



n2'«(n -m + r) 12"-"^+^ ^ J (l - T(2))^''~''"^^^^ 

We find the coefficient of $z^n$ by evaluating a contour integral as in (10.11) and (21.4); it is 

_L^,»W(l_,)./.^|, (23.22) 
where 

$$g(z) = nz + (3r - d)\bigl(\ln z - \ln(1 - z)\bigr) + r\ln(2 - z) - m\ln z + (n - m)\ln(2 - z)\,. \tag{23.23}$$

The key to this theorem is the fact that, when $\rho = \delta = 0$, there is a saddle point at 
$z = 1 - \sigma$: 



n 2{1 + ii) \ 1-a a 1 + a 

- } + }-\= 0. (23.24) 

2(1 -ct) 2(1 + (j) ^ ^ 

Moreover, $g''(1 - \sigma) = 5\mu n + O(\mu^2 n)$ in that case. If we integrate on the path 
$z = 1 - \sigma + it/\sqrt{\mu n}$, as we did in Lemma 7 (section 21), the logarithm of the result will be 



where $r = r_\mu$ and $d = d_\mu$. The relevant quantity s needed in Lemma 8 is 

s = ^ (23.25) 

because of (23.15). The evaluation of the stated logarithm is tedious, but it can be done 
in a reasonable amount of time with computer assistance, using some simplifications such 
as 

$$3r - d = \frac{\sigma(\mu+\sigma)^2}{2(1+\mu)}\,n\,, \qquad n - m + r = \frac{1-\sigma^2}{2(1+\mu)}\,n\,.$$

The term $\ln((6r - 2d)!/(3r - d)!)$ from (7.3) can be evaluated as $(3r - d)\ln(3r - d) - 
3r + d + (6r - 2d)\ln 2 + \frac12\ln 2 + O(\mu)$. It is not difficult to verify that the terms involving 
$n\ln n$ cancel. There are three terms involving $n\ln\mu$, namely $(3r - d)\ln\mu^2$, $-d\ln\mu$, and 
$-(2r - d)\ln\mu^3$, coming respectively from within expansions of $(3r - d)\ln(3r - d)$, $-d\ln s$, 
and $-(2r - d)\ln(2r - d)$; there are two other terms, $-(3r - d)\ln\sigma$ from within $g(1 - \sigma)$ 
and $+(3r - d)\ln\sigma$ from within $(3r - d)\ln(3r - d)$, which also cancel. The most difficult 
part of the computation is the sum of about 16 terms that are rational functions in $\mu$ 
and $\sigma$, times n; these too sum to zero, using relations (23.14). The net result is that the 
complicated logarithm sums to $\ln 3 - 2\ln 2 - \ln\pi - \frac12\ln 5 - \frac72\ln\mu - \ln n + O(\mu) + O(\mu^{-3}n^{-1})$; 
this proves the theorem when $\rho = \delta = 0$. 

For the case of general $\rho$ and $\delta$ the calculations are similar but even worse. We now 
choose the integration path 

z — 1 — a — ^ p/ ^/Jm + it/ ^/Jm] (23.26) 



the first-order effects of $\rho$ and $\delta$ then cancel out, and the second-order effects contribute 
$-\frac{3}{20}\rho^2 - \frac34\delta^2$ to the logarithm of the result. □ 
24. A waiting game. Now let's consider a little game. Start with an empty multi- 
graph and add edges repeatedly at random until either (1) two different complex compo- 
nents are present; or (2) the multigraph is unclean. Case 1 represents the event "we have 
left the top line of Figure 1 before leaving the top line of Figure 2." 

Let $G_0(w, z)$ be the bgf for all multigraphs such that the game has not yet stopped. 
Then 

$$\sum_m \frac{2^m m!\,n!}{n^{2m}}\,[w^m z^n]\,G_0(w, z) \tag{24.1}$$

is the expected running time of the game. We have 

$$G_0(w, z) = e^{U(wz)/w}\sum_r w^r K_r(wz)\,, \tag{24.2}$$

where Kr{z) generates all clean cyclic multigraphs, weighted by the probability that they 
will arise as the cyclic part of a multigraph occurring during the game. 

We learned in section 17 how to compute weighting factors that account for the history
of transitions in Figure 1 among clean multigraphs; and we learned more specifically in
section 20 how these coefficients change when a multigraph gains random edges. In consequence,
we can conclude that $K_r(z) = k_r T(z)^{2r}/\bigl(1-T(z)\bigr)^{3r+1/2}$, where $k_1 = e_1 = \frac{5}{24}$ and the
later coefficients obey the rule

$$k_{r+1} = \tfrac32\, r\, k_r . \tag{24.3}$$

Here's why: Given $k_r T^{2r}/(1-T)^{3r+1/2}$, the generating function for a clean vertex x is

$$\tfrac12 k_r T^{2r+1}/(1-T)^{3r+5/2} + 3r\,k_r T^{2r+1}/(1-T)^{3r+5/2} , \tag{24.4}$$

where the first term corresponds to cases where x is in the unicyclic part. Similarly, given
the generating function $\frac12 k_r T^{2r+1}/(1-T)^{3r+5/2}$ after x is chosen to be unicyclic, the
generating function for a clean unicyclic y is

$$\tfrac52\cdot\tfrac12\, k_r T^{2r+2}/(1-T)^{3r+9/2} ; \tag{24.5}$$

here $\frac52 = 1 + 1 + \frac12$, for choosing y on the half-edge to x, or on the self-loop attached to
that half-edge, or in a different unicyclic component. We obtain a new bicyclic component
if and only if both x and y were unicyclic. Therefore the generating function for cases
where the game continues is

$$\bigl((3r+\tfrac12)(3r+\tfrac52) - \tfrac54\bigr)\, k_r T^{2r+2}/(1-T)^{3r+9/2} .$$

As in (20.9) and (20.10), we multiply by $(1-T)/(6r+6)$ to account for merging (x, y)
with the existing edges. This proves (24.3), because $(3r+\frac12)(3r+\frac52)-\frac54 = 9r(r+1)$.
Equation (24.3) implies, of course, that

$$k_r = \tfrac{5}{24}\,\bigl(\tfrac32\bigr)^{r-1}(r-1)! . \tag{24.6}$$

Comparing this to the case d = 0 of (7.16), we have

$$k_r = \tfrac{5\pi}{18}\, e_r \bigl(1 + O(r^{-1})\bigr) . \tag{24.7}$$

Therefore the similar calculations of section 22, where we found that pi{n) — p'i(n) = 
1 —P2 ('^) ~ I ) tell us that the game will stop in Case (2) with probability $\frac{5\pi}{18}$. This provides
further evidence in support of the top-line conjecture that was made in section 18. 
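The recurrence (24.3) and the limiting ratio in (24.7) are easy to check numerically. The following is a minimal sketch, not part of the original analysis. It assumes $k_1 = e_1 = 5/24$ and the ratio $e_{r+1}/e_r = (6r+1)(6r+5)/(24(r+1))$, which is what the closed form of (7.16) with d = 0 is taken to give here; the function names are hypothetical.

```python
from fractions import Fraction
from math import pi

def k_coefficients(r_max):
    """k_1 = 5/24 and k_{r+1} = (3/2) r k_r, as in (24.3)."""
    k = {1: Fraction(5, 24)}
    for r in range(1, r_max):
        # ((3r+1/2)(3r+5/2) - 5/4) / (6r+6) should simplify to 3r/2
        step = ((3 * r + Fraction(1, 2)) * (3 * r + Fraction(5, 2))
                - Fraction(5, 4)) / (6 * r + 6)
        assert step == Fraction(3 * r, 2)
        k[r + 1] = step * k[r]
    return k

def ratio_k_over_e(r_max):
    """Floating-point value of k_r / e_r at r = r_max (starting from k_1/e_1 = 1)."""
    ratio = 1.0
    for r in range(1, r_max):
        # (k_{r+1}/k_r) / (e_{r+1}/e_r), with the ASSUMED ratio for e_r stated above
        ratio *= 36.0 * r * (r + 1) / ((6 * r + 1) * (6 * r + 5))
    return ratio

k = k_coefficients(5)
print(k[2], k[3])                          # 5/16 15/16
print(ratio_k_over_e(2000), 5 * pi / 18)   # the two values should be close
```

The ratio converges slowly, with relative error of order 1/r, which matches the $O(r^{-1})$ term in (24.7).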

We can now try to compute the expected time for the game to be completed, but it
appears to be quite complicated. The contribution to (24.1) from a given m and r can be
obtained by changing $e_r$ to $k_r$ in (13.17) when m and r are not too large; this means we
want to evaluate

^ & a3^/W ( 1 , 5 (ly jr-iy. \ 



k>0 



^T{l/2-2k/3) ^36V2y r(r + 1/2 - 2A;/3) 



in place of (14.1), representing e^^/^ times the probability that the game is still alive after 

_5_ 
72 



m edges. The inner sum is known to be ^ times 




r>0 



r! 1 " ^1 1. ^ - 

J- 1 -L 1 ~ 



2 J T{r + 3/2- 2k/3) r(3/2 - 2/c/3) V' '2 3 '2 



, (24.9) 



r(l/2-2/c/3) V V4 37 V4 3 

so it has the value ^/n when k = 0. (Here, as usual, $\psi(z) = \Gamma'(z)/\Gamma(z)$.) Further study of
(24.8) should prove to be interesting. 

25. Waiting time in general. Bivariate generating functions provide a useful tool for 
studying the "first occurrences" of particular graphs or multigraphs, as shown in [14]. The 
special problems considered in that paper can be put into the following general framework. 

Let S be any collection of multigraphs, with bgf $S(w,z)$. Suppose we wish to study
the first time that an evolving multigraph on n vertices does not lie in S. If $[z^n]\,S(0,z) = 0$,
the empty graph on n vertices is not in S, so the process never gets started. Otherwise,
the probability that an evolving multigraph lies in S when it has m − 1 edges but not when
it has m is

$$\frac{2^{m-1}(m-1)!\,n!}{n^{2m}}\,[w^m z^n]\,(w\vartheta_z^2 - 2\vartheta_w)\,S(w,z) . \tag{25.1}$$
The proof is simple, by definition of the operators $\vartheta_w$ and $\vartheta_z$, because the probability in
question is

$$\frac{2^{m-1}(m-1)!\,n!}{n^{2m-2}}\,[w^{m-1}z^n]\,S(w,z) \;-\; \frac{2^m m!\,n!}{n^{2m}}\,[w^m z^n]\,S(w,z) .$$

For convenience we shall write

$$\nabla S(w,z) = (w\vartheta_z^2 - 2\vartheta_w)\,S(w,z) ; \tag{25.2}$$

we call $\nabla S$ the bgf for "stopping configurations," while S itself is the bgf for "going
configurations."

The operator $\Phi_n$, introduced in [14], is

$$\Phi_n F(w,z) = \sum_{m\ge1} \frac{2^{m-1}(m-1)!\,n!}{n^{2m}}\,[w^m z^n]\,F(w,z) . \tag{25.3}$$

Equations (25.1)-(25.3) imply that $\Phi_n\nabla S(w,z)$ is the probability that a stopping
configuration will be encountered when some edge is added to an initially empty multigraph.
A similar operator

$$\hat\Phi_n F(w,z) = \sum_{m\ge1} \frac{n!}{2m\binom{n(n-1)/2}{m}}\,[w^m z^n]\,F(w,z) \tag{25.4}$$

for graphs instead of multigraphs is considered in [14], but we will restrict consideration
to multigraphs for simplicity. (As one might expect from section 6, we should use the
operator

$$\hat\nabla = w(\vartheta_z^2 - \vartheta_z - 2\vartheta_w) - 2\vartheta_w \tag{25.5}$$

in place of $\nabla$ when defining stopping configurations for the graph process.)

Several examples will help clarify these definitions and demonstrate their usefulness.
Since the bgf $G(w,z)$ for all multigraphs satisfies $\vartheta_z^2 G = 2w^{-1}\vartheta_w G$, equation (4.2), we
have $\nabla G = 0$; this, of course, is obvious, because there are no stopping configurations
when all multigraphs are permitted.

Example 1. Let $S(w,z)$ be the bgf for all multigraphs having nothing but self-loops.
Clearly $S(w,z) = e^{z e^{w/2}}$, because $z e^{w/2}$ is the bgf for a single vertex with nothing but
self-loops. Formula (25.2) now tells us that

$$\nabla S(w,z) = w z^2 e^{w} e^{z e^{w/2}} , \tag{25.6}$$

because $\vartheta_z^2 S = z^2 e^{w} S + z e^{w/2} S$ and $\vartheta_w S = \frac12 w z e^{w/2} S$. Thus, by (25.1), the probability
that an evolving multigraph first fails to lie in S when it acquires the mth edge is

$$\frac{2^{m-1}(m-1)!\,n!}{n^{2m}}\,[w^m z^n]\, w z^2 e^{w} e^{z e^{w/2}}
 = \frac{2^{m-1}(m-1)!\,n!}{n^{2m}}\,[w^{m-1}]\,\frac{e^{w} e^{(n-2)w/2}}{(n-2)!}
 = \frac{2^{m-1}(m-1)!\,n(n-1)}{n^{2m}}\,[w^{m-1}]\, e^{nw/2}
 = \frac{n(n-1)}{n^{m+1}} .$$

And sure enough, $n^{1-m} - n^{-m}$ is obviously the probability that a sequence of edges
$(x_1,y_1) \ldots (x_m,y_m)$ will have $x_1 = y_1$, ..., $x_{m-1} = y_{m-1}$, $x_m \ne y_m$.
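The stopping-probability formula can be checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions rather than anything taken from [14]: it takes the stopping probability to be $\frac{2^{m-1}(m-1)!\,n!}{n^{2m}}[w^m z^n](w\vartheta_z^2 - 2\vartheta_w)S$, uses $[w^m z^n]\,G = n^{2m}/(2^m m!\,n!)$ for all multigraphs and $[w^m z^n]\,e^{z e^{w/2}} = n^m/(2^m m!\,n!)$ for the self-loop family, and works on coefficients only. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def coeff_G(m, n):
    """[w^m z^n] G(w,z) for all multigraphs: n^(2m) / (2^m m! n!)."""
    return Fraction(n ** (2 * m), 2 ** m * factorial(m) * factorial(n))

def coeff_loops(m, n):
    """[w^m z^n] exp(z e^(w/2)): multigraphs whose edges are all self-loops."""
    return Fraction(n ** m, 2 ** m * factorial(m) * factorial(n))

def stopping_probability(coeff, m, n):
    """Probability that the family is left exactly when the mth edge arrives.

    theta_z multiplies [w^m z^n] by n and theta_w multiplies it by m, so
    [w^m z^n] (w theta_z^2 - 2 theta_w) S = n^2 s(m-1, n) - 2 m s(m, n).
    """
    nabla = n * n * coeff(m - 1, n) - 2 * m * coeff(m, n)
    scale = Fraction(2 ** (m - 1) * factorial(m - 1) * factorial(n), n ** (2 * m))
    return scale * nabla

n, m = 4, 3
print(stopping_probability(coeff_G, m, n))       # 0: no stopping configurations
print(stopping_probability(coeff_loops, m, n))   # 3/64 = n^(1-m) - n^(-m)
```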

Example 2. Let $S(w,z)$ be the bgf for all acyclic multigraphs, namely $e^{U(w,z)} = e^{U(wz)/w}$.
The formulas

$$\vartheta_z U = w^{-1}T, \qquad \vartheta_z^2 U = w^{-1}T/(1-T), \qquad \vartheta_w U = \tfrac12 w^{-1}T^2 \tag{25.7}$$

were derived in section 4, and we have

$$\vartheta_z^2 e^F = (\vartheta_z^2 F)\,e^F + (\vartheta_z F)^2 e^F \tag{25.8}$$

for any $F = F(w,z)$; hence

$$\nabla e^U = \frac{T}{1-T}\, e^U . \tag{25.9}$$

These are the stopping configurations that define the appearance of the first cycle in an
evolving multigraph. The term $T^k e^U$ corresponds to a first cycle of length k; therefore
if we replace $T^k$ by $kT^k$ and sum over all stopping times, we get an expression for the
expected length of the first cycle,

$$\Phi_n \frac{T}{(1-T)^2}\, e^U . \tag{25.10}$$

This was one of the main problems studied in [14], where it was shown that the expected
length is proportional to $n^{1/6}$ although the standard deviation is proportional to $n^{1/4}$.
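For small n the first-cycle formulas can be evaluated exactly. The sketch below is illustrative only and rests on these assumptions: $T(x) = \sum_{k\ge1} k^{k-1}x^k/k!$, $U = T - T^2/2$, the stopping bgf is $\frac{T}{1-T}e^{U(wz)/w}$ with $T = T(wz)$, and $\Phi_n$ sums $\frac{2^{m-1}(m-1)!\,n!}{n^{2m}}[w^m z^n]$ over m ≥ 1. Since $e^{U(wz)/w} = \sum_j U(wz)^j/(j!\,w^j)$, the coefficient of $w^m z^n$ comes from j = n − m. The helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def mul(a, b, N):
    """Product of two power series (coefficient lists), truncated after x^N."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a[:N + 1]):
        if ai:
            for j, bj in enumerate(b[:N + 1 - i]):
                c[i + j] += ai * bj
    return c

def phi_first_cycle(n, weight=lambda k: 1):
    """Phi_n applied to (sum_k weight(k) T^k) e^U, for multigraphs on n vertices."""
    T = [Fraction(0)] + [Fraction(k ** (k - 1), factorial(k)) for k in range(1, n + 1)]
    U = [t - s / 2 for t, s in zip(T, mul(T, T, n))]         # U = T - T^2/2
    F, power = [Fraction(0)] * (n + 1), T
    for k in range(1, n + 1):                                # F = sum_k weight(k) T^k
        F = [f + weight(k) * p for f, p in zip(F, power)]
        power = mul(power, T, n)
    total, U_pow = Fraction(0), [Fraction(1)] + [Fraction(0)] * n
    for j in range(n):                                       # j = n - m
        m = n - j
        coeff = mul(F, U_pow, n)[n] / factorial(j)           # [w^m z^n] F(wz) e^{U(wz)/w}
        total += Fraction(2 ** (m - 1) * factorial(m - 1) * factorial(n),
                          n ** (2 * m)) * coeff
        U_pow = mul(U_pow, U, n)
    return total

print(phi_first_cycle(5))                       # 1: a cycle surely appears
print(phi_first_cycle(2, weight=lambda k: k))   # 5/4: expected first-cycle length, n = 2
```

The value 5/4 for n = 2 agrees with a direct count: the first edge is a loop with probability 1/2, and otherwise the second edge closes a cycle of length 1 or 2 with equal probability.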

Example 3. Let $S(w,z) = U(w,z)$ be the bgf for unrooted trees. This is a perverse
example, thrown in primarily because (25.7) gives us the information we need to calculate

$$\nabla U = \frac{T}{1-T} - \frac{T^2}{w} = wz + (-w + 2w^2)z^2 + (-2w^2 + \tfrac92 w^3)z^3 + \cdots . \tag{25.11}$$

What is the meaning of these negative coefficients?

The example does make sense, if we rephrase our interpretation of (25.1). The exact
meaning of

$$\frac{2^{m-1}(m-1)!\,n!}{n^{2m}}\,[w^m z^n]\,\nabla S(w,z)$$

is, "the probability that an evolving multigraph leaves S when the mth edge is added,
minus the probability that it enters S when the mth edge is added." In our example,
$U(0,z) = z$; when there are two or more vertices, the empty multigraph is not a tree,
but it can become one later. The bgf for becoming a tree is $w^{-1}T^2$, corresponding to an
ordered pair of rooted trees with m − 1 edges. The bgf for adding a new edge (x, y) to a
tree is $\sum_{k\ge0} T^{k+1}$, where the term $T^{k+1}$ corresponds to cases where x and y are at distance k.
(Each appearance of $T = T(wz)$ includes an implicit edge touching the tree root, because
w and z appear with equal powers in every term.)
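The signed coefficients of Example 3 can be reproduced mechanically. This is a small sketch, assuming the tree function $T(x) = \sum_{k\ge1} k^{k-1}x^k/k!$ and the stopping bgf $T/(1-T) - T^2/w$ with $T = T(wz)$: the coefficient of $x^n$ in $T/(1-T)$ contributes to $w^n z^n$, and the coefficient of $x^n$ in $T^2$ contributes, with a minus sign, to $w^{n-1}z^n$. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def tree_powers(N):
    """Truncated series (T/(1-T), T^2) in x, where T(x) = sum k^(k-1) x^k / k!."""
    T = [Fraction(0)] + [Fraction(k ** (k - 1), factorial(k)) for k in range(1, N + 1)]

    def mul(a, b):
        c = [Fraction(0)] * (N + 1)
        for i, ai in enumerate(a):
            for j, bj in enumerate(b[:N + 1 - i]):
                c[i + j] += ai * bj
        return c

    geometric, power = [Fraction(0)] * (N + 1), T
    for _ in range(N):                      # T + T^2 + ... + T^N suffices up to x^N
        geometric = [g + p for g, p in zip(geometric, power)]
        power = mul(power, T)
    return geometric, mul(T, T)

def nabla_U(N):
    """Map n -> {power of w: coefficient} for the z^n term of T/(1-T) - T^2/w."""
    geometric, T2 = tree_powers(N)
    return {n: {n: geometric[n], n - 1: -T2[n]} for n in range(1, N + 1)}

for n, poly in nabla_U(3).items():
    print(n, poly)
```

The negative entries are the "entering" probabilities described above: they count the ways a forest of two rooted trees becomes a single tree.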

Example 3 cautions us to interpret the operators $\nabla$ and $\Phi_n$ a bit more carefully. In
general, we have the identity

$$\Phi_n\nabla S(w,z) = n!\,[z^n]\,S(0,z) - \lim_{m\to\infty} \frac{2^m m!\,n!}{n^{2m}}\,[w^m z^n]\,S(w,z) , \tag{25.12}$$

for any bgf $S(w,z)$ such that the limit exists, because

$$\Phi_n\nabla S(w,z) = \sum_{m\ge1} \Bigl( \frac{2^{m-1}(m-1)!\,n!}{n^{2m-2}}\,[w^{m-1}z^n]\,S(w,z) - \frac{2^m m!\,n!}{n^{2m}}\,[w^m z^n]\,S(w,z) \Bigr) .$$

A sufficient condition for the limit to exist is that the coefficients of $\nabla S(w,z)$ are
nonnegative. A sufficient condition for the coefficients to be nonnegative is that $S(w,z)$ should
represent a family of multigraphs S with the property that the deletion of any edge
preserves membership in S.

Example 4. Let $S(w,z) = G(w,z) - C(w,z)$ be the bgf for all disconnected multigraphs.
The stopping configurations now represent the first time an evolving multigraph becomes
connected. Since $G(w,z) = e^{C(w,z)}$, we have

$$\vartheta_w C = \vartheta_w \ln G = (\vartheta_w G)/G ;$$
$$\vartheta_z C = \vartheta_z \ln G = (\vartheta_z G)/G ;$$
$$\vartheta_z^2 C = (\vartheta_z^2 G)/G - (\vartheta_z G)^2/G^2 ;$$

hence

$$\nabla S = \nabla G - \nabla C = w(\vartheta_z C)^2 . \tag{25.13}$$

Of course! This is an edge that joins an ordered pair of vertices marked in distinct
components.
Example 5. Let $S(w,z)$ be any bgf of the form

$$S(w,z) = e^{U(w,z)+V(w,z)} H(w,z) . \tag{25.14}$$

Then we can use (25.7) and (4.9) to compute

$$\nabla S = e^{U+V}\bigl((2T\vartheta_z - 2\vartheta_w + w e^{-V}\vartheta_z^2 e^{V})\,H\bigr) . \tag{25.15}$$

For example, when $S(w,z) = G(w,z)$, the left side of (25.15) is zero, and $H(w,z)$ is the
bgf we have called $E(w,z)$. Equating the right side of (25.15) to zero gives the differential
equation (5.1) that we originally used to compute $E(w,z)$.

In the special case H{w,z) = 1, the stopping configurations correspond to the first 
time an evolving multigraph acquires a bicyclic component, i.e., the time when its excess 
changes from 0 to 1. This is another problem that was considered in [14], where it was
shown that the expected number of unicyclic components present at the time is | In n + 
O(1). If we express H in terms of univariate generating functions,

$$H(w,z) = \sum_{r\ge0} w^r H_r(wz) , \tag{25.16}$$

then (25.15) can be written

$$\nabla S = e^{U+V} \sum_{r\ge1} w^r\, \nabla H_r(wz) , \tag{25.17}$$

where the univariate function $\nabla H_r(z)$ is related to (5.3):

$$\nabla H_r(z) = e^{-V}\vartheta^2 e^{V} H_{r-1}(z) - 2\bigl(r + (1-T)\vartheta\bigr) H_r(z) . \tag{25.18}$$

Example 6. Specializing Example 5 further, let

$$S(w,z) = e^{U(w,z)+V(w,z)} \sum_{r=0}^{R} w^r E_r(wz) , \tag{25.19}$$

where R is any nonnegative integer. Then the stopping configurations $\nabla S$ represent the
time when an evolving multigraph first acquires excess R + 1. Expression (25.18) becomes
almost trivial because $\nabla H_r$ is zero for all $r \ne R+1$; we have

$$\nabla S(w,z) = w^{R+1} e^{U(w,z)}\, \vartheta_z^2\, e^{V(w,z)} E_R(wz) . \tag{25.20}$$

This family S has the property that $\Phi_n\nabla S = 1$, by (25.12), because a multigraph
surely acquires excess R + 1 at some time $m \le n + R + 1$. We can write the identity
$\Phi_n\nabla S = 1$ more explicitly, using our known formula for $E_R$, and using r in place of R:

2r rp( \2r—d \ 

^ Z%r-...l. = 1 . (25.21) 

d=o (l-T{wz)) J 
for all n ≥ 1 and r ≥ 0. Moreover, we can write (25.20) in the form

$$\nabla S(w,z) = 2w^{R+1} e^{U(w,z)+V(w,z)} \bigl(R+1+(1-T)\vartheta\bigr) E_{R+1}(wz) ,$$

using (5.3). Setting r = R+1 and applying (20.9) gives us another way to express (25.21),

(2r T-i/ \2) — d \ 

for all n ≥ 1 and r ≥ 1.

For example, the case r = 1 of (25.22) is 

*- {^'" (5 (T^ + \ JT^) ) = 1 ■ P^-^^' 

The operator is defined in (25.3) to be a sum over m, and the mth term of (25.23) is 
2^-^{m - l)!n! 



1 2"^m! nl 



2m{n — m + 1) n^'"(n — m)! 
1 2"*m!n! 



[^-] U{z)"+'f{T{z)) 
where /(T) = |T7(1 - T)^/^ + iT/(l - T)^/^ and g{T) = (2 - T)Tf(T). We can write 



(l_T)9/2 (l_T)7/2 (1-T)V2 (l-T)3/2 (1 - T)V2 ' 

so we can evaluate (25.24) by summing five applications of formula (10.1). The value is 
negligibly small unless m is |n + 0(n^/3), hence the factor 4m(n — m + 1) can be assumed 
to equal + 0{iv'/^). The five terms of g yield values of order n'^/^, n, n^/^, n-'^/^, and 1 
respectively, according to (10.1); thus the leading term |/(1 — T)^/^ must be responsible 
for the major contribution to (25.23), and the mth term of the sum when m — |n + |//n^/3 
will be 

|n-2/3V2^^(|,^) + 0(n-i). 
Summing over m yields 1. Therefore it must be true that 

8/5 



ji 1 dji 



This integral formula is not at all obvious from the definition of A{y^ii) in (10.2), and it 
would be interesting to find a direct proof. 
The argument we have just given can be extended to arbitrary r, starting with (25.22),
and it implies the following remarkable result:

$$\int_{-\infty}^{\infty} A(3r+\tfrac32,\mu)\,d\mu = \frac{1}{3r\,e_r\sqrt{2\pi}} , \qquad \text{integer } r\ge1 . \tag{25.25}$$

By (8.17) we can also write

$$\int_{-\infty}^{\infty} A(3r+\tfrac32,\mu)\,d\mu = \frac13\Bigl(\frac23\Bigr)^{r} \frac{\Gamma(r)\sqrt{2\pi}}{\Gamma(r+1/6)\,\Gamma(r+5/6)} . \tag{25.26}$$

We have just proved that, if $M_{r,n} = \frac12 n + \frac12 U_{r,n}\,n^{2/3}$ is the number of edges when the
excess first reaches r, then

$$\Pr(M_{r,n} = m) = 6r\,e_r\sqrt{2\pi}\,A(3r+\tfrac32,\mu)\,n^{-2/3} + O(n^{-1}) ; \tag{25.27}$$

hence $U_{r,n} \to U_r$ in distribution, where $U_r$ has the density function

$$f_r(\mu) = 3r\,e_r\sqrt{2\pi}\,A(3r+\tfrac32,\mu) , \qquad -\infty<\mu<\infty . \tag{25.28}$$

Combining this formula with (13.17), we have

$$\sqrt{2\pi}\,e_r A(3r+\tfrac12,\mu) = \lim_{n\to\infty}\Pr(S_r) = \lim_{n\to\infty}\Pr(M_{r,n}\le m<M_{r+1,n}) = \int_{-\infty}^{\mu}\bigl(f_r(u)-f_{r+1}(u)\bigr)\,du ,$$

whence

$$\sqrt{2\pi}\,e_r A'(3r+\tfrac12,\mu) = f_r(\mu)-f_{r+1}(\mu) . \tag{25.29}$$

In fact, (25.29) can be derived also by setting $y = 3r+\frac12$ in the formula

$$A'(y,\mu) = (y-\tfrac12)\,A(y+1,\mu) - \tfrac12\,y(y+2)\,A(y+4,\mu) , \tag{25.30}$$

which is a consequence of (10.22) and (10.23).

26. Continuous excess. Let $I(y)$ be the integral in (25.25) when the parameter r is not
necessarily an integer:

$$I(y) = \int_{-\infty}^{\infty} A(y,\mu)\,d\mu . \tag{26.1}$$

It is natural to conjecture that formula (25.26) holds for y in general:

$$I(y) = \frac{\sqrt{\pi}\;2^{y/3}\,\Gamma(y/3-1/2)}{3^{y/3+1/2}\,\Gamma(y/3+1/3)\,\Gamma(y/3-1/3)} , \qquad y>\tfrac32 . \tag{26.2}$$

(The condition $y>\frac32$ is necessary and sufficient for convergence of the integral, because
of (10.3) and (10.4).) And indeed, this conjecture is true.
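The closed form can be explored numerically with the standard library. The sketch below assumes that the closed form (26.2) reads $I(y) = \sqrt{\pi}\,2^{y/3}\Gamma(y/3-1/2)\big/\bigl(3^{y/3+1/2}\Gamma(y/3+1/3)\Gamma(y/3-1/3)\bigr)$ for y > 3/2, and it checks three consequences: the value at y = 3, the constant $\sqrt{\pi/2}\,I(2)$ for the first-cycle length, and the three-step recurrence $I(y+3) = \frac{2y-3}{y^2-1}I(y)$. The function name `I0` is hypothetical, and nothing here evaluates the integral of $A(y,\mu)$ itself.

```python
from math import gamma, pi, sqrt

def I0(y):
    """Assumed closed form (26.2); the integral converges only for y > 3/2."""
    if y <= 1.5:
        raise ValueError("closed form is stated only for y > 3/2")
    return (sqrt(pi) * 2 ** (y / 3) * gamma(y / 3 - 0.5)
            / (3 ** (y / 3 + 0.5) * gamma(y / 3 + 1 / 3) * gamma(y / 3 - 1 / 3)))

print(I0(3.0))                    # should be very close to 1
print(sqrt(pi / 2) * I0(2.0))     # should be close to 2.0336920140...
for y in (2.0, 3.5, 7.25):
    # both columns should agree: the recurrence obtained by integrating (25.30)
    print(I0(y + 3) / I0(y), (2 * y - 3) / (y * y - 1))
```

With $e_1 = 5/24$, the value `I0(4.5)` should also match $1/(3e_1\sqrt{2\pi})$, which is the case r = 1 of (25.25).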
Theorem 14. The integral (26.1) has the closed form (26.2).

Proof. Let $I_0(y)$ be the right-hand side of (26.2); we wish to show that $I(y) = I_0(y)$.
Clearly

$$I_0(y+3) = \frac{2y-3}{y^2-1}\,I_0(y) , \qquad y>\tfrac32 . \tag{26.3}$$

Since $\int_{-\infty}^{\infty} A'(y,\mu)\,d\mu = 0$ for $y>\frac12$, by (10.3) and (10.4), we can integrate (25.30) and
replace y by y − 1 to get the same recurrence for $I(y)$:

$$I(y+3) = \frac{2y-3}{y^2-1}\,I(y) , \qquad y>\tfrac32 . \tag{26.4}$$

Therefore $I(y)/I_0(y)$ is a periodic function, and we need only prove asymptotic equivalence
$I(y)\sim I_0(y)$ as $y\to\infty$ in order to verify strict equality $I(y) = I_0(y)$ for all $y>\frac32$.

The duplication and triplication formulas for the Gamma function provide us with an 
alternate expression for Io{y): 

, , f9y~^ r(22/-l) 1 /2ey , , 

To show that $I(y)$ has the same asymptotic behavior, we break the integral into two parts,

$$I(y) = \int_{-\infty}^{0} A(y,\mu)\,d\mu + \int_{0}^{\infty} A(y,\mu)\,d\mu = I_-(y) + I_+(y) . \tag{26.6}$$

By definition (10.2) we have

$$I_+(y) = \frac{1}{3^{(y+1)/3}} \int_0^\infty \sum_{k\ge0} \frac{e^{-\mu^3/6}\,(\frac12 3^{2/3}\mu)^k}{k!\,\Gamma((y+1-2k)/3)}\,d\mu ; \tag{26.7}$$

we will show that the asymptotic value of $I_+(y)$ can be obtained by interchanging
summation and integration, then estimating the resulting sum.
Let $a_k$ be the kth term after integration,

$$a_k = \int_0^\infty \frac{e^{-\mu^3/6}\,(\frac12 3^{2/3}\mu)^k\,d\mu}{k!\,\Gamma((y+1-2k)/3)} = \frac{2^{(1-2k)/3}\,3^{k-2/3}\,\Gamma((k+1)/3)}{k!\,\Gamma((y+1-2k)/3)} . \tag{26.8}$$

If $a_k = 0$ then $a_{k+3} = 0$; otherwise we have

$$\frac{a_{k+3}}{a_k} = \frac{(2k+5-y)(2k+2-y)}{4(k+2)(k+3)} , \tag{26.9}$$
which is greater than 1 when $k < (y^2-7y-14)/(4y+6)$, less than 1 when k exceeds that
value, and nonnegative except for one or two values of k near $\frac12 y$. So the largest terms $a_k$
occur when k is near $\frac14 y$. If y > 5 and k > y/2, we have

$$\left|\frac{a_{k+3}}{a_k}\right| < \Bigl(1-\frac{y-5}{2k}\Bigr)^{2} < \Bigl(1+\frac{3}{k}\Bigr)^{(5-y)/3} ,$$

and it follows that $a_k = O(k^{(5-y)/3})$ as $k\to\infty$. Therefore $\sum |a_k|$ exists, and the
interchange of summation and integration is justified, at least for large y.

Let k = ^y + X, where < Then Stirling's formula tells us that 

Inofc = — — ln2 + — — ln3 — Iny + - - In V^r - — + Od/"" /). (26.10) 

o o D o 6y 

If < e < |, this implies that the sum of all terms for |a;| > y^l"^^^ is superpolynomially 
small in relation to the sum of terms for |a;| < y^/^"*"^; hence 



oo 
A;=0 



2(j/+3)/33(2y-2)/3gy/2 



y 



(22/+3)/6 



and we have 



K)73 



3(J/+i)/3 



fe=0 



^o(y) 



(26.11) 



The proof of (26.2) will therefore be complete if we can show that $I_-(y)/I_+(y)\to0$
as $y\to\infty$. For this we can use (10.9) to show that



A{y,-a)<^a^l''-y 



1 + 



it 



a 



3/2 



a 



1/2-?/ 



27r 



therefore the first portion of /_ (y) is quite small, 

a^/^-yda = 0{y-^/^-y/^) 



r-V ■ I 

j A{y,n)dn< 

J —oo 



-\/27r Jyi/3 



On the other hand when —y^^^ < < we can integrate (10.7) from y^^^—ioo to y^^^+ioo, 
obtaining 



1 

2^ 
1 



\y 



< _2/(i-2/)/3e 



J — c 



exp 



,1/3 , ^^ ,2 



(it 



y(i-2')/3exp(y/3 + /x(3y2/3-//2)/6) ^ ^1/6 ^ey/^ 



v/27r (2y 1/3 + ^) 
hence



and $I_-(y) \ll I_+(y)$ as desired. □

Theorem 14 sheds further light on the results of [14], where the first cycle of a random
multigraph was shown to have average length asymptotic to $\sqrt{\pi/2}\,I(2)\,n^{1/6}$. According
to a lengthy numerical calculation sketched there, this coefficient was determined to be
2.0337, correct to four decimal places. Sure enough, equation (26.2) now confirms that the
exact value is

$$\sqrt{\frac{\pi}{2}}\;I(2) = \frac{2^{1/6}\,\pi\,\Gamma(1/6)}{3^{7/6}\,\Gamma(1/3)} = 2.03369\,20140\,63898\,89186\,17247\,01028\,49830\,16693- . \tag{26.12}$$

Section 7 of [14] also proves implicitly that, if the random variables L and S are
respectively the length of the first cycle and the size of the component containing that
cycle, we have

$$\mathrm{E}\,L^k \sim \sqrt{\pi/2}\;k!\,I(k+1)\,n^{k/3-1/6} , \tag{26.13}$$
$$\mathrm{E}\,S^k \sim 2^{k-1/2}\,\Gamma(k+\tfrac12)\,I(2k+1)\,n^{2k/3-1/6} . \tag{26.14}$$

In particular, the variance of L is asymptotically $\sqrt{2\pi n}$; the asymptotic mean and variance
of S are $\sqrt{\pi n/2}$ and $K n^{7/6}$, where K is the constant in (26.12). For graphs instead of
multigraphs, these coefficients should all be multiplied by e^/^. 

Notice that $I(3) = 1$. Hence the function $A(3,\mu)$, which is expressible in terms of
Airy series or Bessel functions (see (10.32)), defines a probability density.

Let $V_y$ be a random variable with density function $A(y,\mu)/I(y)$, when $y>\frac32$. Then,
by (10.22),

$$\mathrm{E}\,V_y = \frac{\int_{-\infty}^{\infty}\mu A(y,\mu)\,d\mu}{I(y)} = \frac{y\,I(y+2)-I(y-1)}{I(y)} = \frac{(y-3)\,I(y-1)}{(y-2)\,I(y)} \tag{26.15}$$

if $y>\frac52$. In particular, the variable $U_r$ of (25.28), which is $V_{3r+3/2}$, has the mean value

$$\frac{(3r-3/2)\,I(3r+1/2)}{(3r-1/2)\,I(3r+3/2)} = \Bigl(\frac34\Bigr)^{1/3}\frac{\Gamma(2r-2/3)}{\Gamma(2r-1)} . \tag{26.16}$$

This is the limit as $n\to\infty$ of $\mathrm{E}\,U_{r,n}$, which represents the mean waiting time for a graph
or multigraph to reach excess r. The values are 0.8113, 1.2621, 1.5191, 1.7104, 1.8666,
2.0002, 2.1181, 2.2241, 2.3209, 2.4102 when 1 ≤ r ≤ 10.
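The ten tabulated means can be regenerated in a few lines. This sketch assumes that the mean value in (26.16) simplifies to $(3/4)^{1/3}\,\Gamma(2r-2/3)/\Gamma(2r-1)$; the function name is hypothetical, and the listed values are the ones quoted in the text.

```python
from math import gamma

def mean_U(r):
    """Assumed form of (26.16): limiting mean of U_r, the scaled time to reach excess r."""
    return 0.75 ** (1 / 3) * gamma(2 * r - 2 / 3) / gamma(2 * r - 1)

quoted = [0.8113, 1.2621, 1.5191, 1.7104, 1.8666,
          2.0002, 2.1181, 2.2241, 2.3209, 2.4102]
for r, value in enumerate(quoted, start=1):
    print(r, round(mean_U(r), 4), value)
```

Since $M_{r,n} = \frac12 n + \frac12 U_{r,n} n^{2/3}$, a mean of 0.8113 for r = 1 says that the first bicyclic component tends to appear about $0.41\,n^{2/3}$ edges after the midpoint $n/2$.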
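Similarly, (10.23) implies that

$$\mathrm{E}\,V_y^2 = \frac{I(y-2)}{I(y)} = \frac{y-2}{6^{1/3}}\,\frac{\Gamma((2y-7)/3)}{\Gamma((2y-6)/3)} , \qquad y>\tfrac72 . \tag{26.17}$$

Hence $\mathrm{E}\,V_y = (y/2)^{1/3}\bigl(1-\frac{7}{6y}+O(y^{-2})\bigr)$, $\mathrm{E}\,V_y^2 = (y/2)^{2/3}\bigl(1-\frac{2}{3y}+O(y^{-2})\bigr)$, and we
have

$$\operatorname{Var} V_y = \frac{5}{3\cdot2^{2/3}}\,y^{-1/3} + O(y^{-4/3}) . \tag{26.18}$$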
Let us now set /x = + az, where 

An argument similar to the derivation of (26.11) proves that 

Therefore $(\operatorname{Var}V_y)^{-1/2}(V_y - \mathrm{E}\,V_y)$ approaches the normal distribution $N(0,1)$ as $y\to\infty$.
In particular, this establishes a kind of asymptotic normality of $U_{r,n}$ (and $M_{r,n}$), if we first
let $n\to\infty$ and then $r\to\infty$.

27. Proof of the top-line conjecture. We are almost ready to settle the conjecture 
that was made in section 18, but first we should carry out the promised refinement of our 
estimates (23.5) and (23.9) for the sizes of the acyclic and unicyclic parts of a random 
multigraph. 

The first step is to consider the quantity (23.3), when $m = \frac12 n(1+\mu)$ and $k = \kappa n$. If
k > m or k > n, expression (23.3) is zero; otherwise $0\le\kappa\le\min(\frac12(1+\mu),1)$, and Stirling's
approximation yields



u Im—k I n—k 2^ m-n- f n—k\'^^ ( ^, x ^/1\ ^/l 
e V \ 7 rw =exp n/ («,//) + +0 



m \ n {n—kY^ \ n J \ ' \m—k ) \n—k 

'(27.1) 
where

$$f(\kappa,\mu) = \frac{1+\mu}{2}\ln(1+\mu) - \frac{1+\mu-2\kappa}{2}\ln(1+\mu-2\kappa) + (\mu-\kappa)\ln(1-\kappa) - \kappa . \tag{27.2}$$

Notice that

$$\frac{\partial f}{\partial\kappa} = \ln(1+\mu-2\kappa) - \ln(1-\kappa) + \frac{\kappa-\mu}{1-\kappa} , \qquad
\frac{\partial^2 f}{\partial\kappa^2} = \frac{(1-\mu)(\mu-\kappa)}{(1+\mu-2\kappa)(1-\kappa)^2} ,$$

so both first and second derivatives vanish when κ = μ. The first derivative is ≤ 0 when
κ = 0; if 0 < μ < 1 it increases to zero when κ = μ, then becomes negative; if μ ≤ 0
or μ ≥ 1 it decreases steadily. Thus $f(\kappa,\mu)$ is a decreasing function of κ, as claimed in
section 23.

We also have

$$\frac{\partial f}{\partial\mu} = \tfrac12\ln(1+\mu) - \tfrac12\ln(1+\mu-2\kappa) + \ln(1-\kappa) ;$$

this derivative decreases steadily, passing through zero when $\mu = \kappa/(2-\kappa)$. Therefore we
have

$$f(\kappa,\mu) \le f\Bigl(\kappa,\frac{\kappa}{2-\kappa}\Bigr) = -\Bigl(\frac{\kappa^3}{24}+\frac{\kappa^4}{24}+\frac{11\kappa^5}{320}+\cdots\Bigr) , \tag{27.3}$$

for all μ ≥ 2κ − 1. In particular, we can conclude that terms like (23.1) and (23.8)
are superpolynomially small for all $k \ge n^{2/3+\epsilon}$, since they are $O(\exp(-n^{3\epsilon}/24))$ when
$k = n^{2/3+\epsilon}$.

Our next goal is to estimate the sum of (23.8) for k ≥ 1 when $\mu \ge n^{-1/3}$. This sum
$V(m,n)$ is the expected number of vertices in unicyclic components after m steps of the
multigraph process. The formulas above allow us to write

A;<n2/3+e ' ♦ 

V- 1 k'^Qik) / . , , flk'' k^ 



+o(^ + ^ + -]] + 0{e--') . (27.4) 

Let $\mu = \alpha n^{-1/3}$, so that α is the quantity we called μ in sections 10-20 above. We
will assume that α ≥ 1, and also that $\alpha \le c\,n^{1/3}$ (hence μ ≤ c), where c is a sufficiently
small constant. The terms of $V(m,n)$ are negligible for $k > n^{2/3+\epsilon}$ regardless of the value
of μ; and when $n^{-1/3} \le \mu \le c$ we can in fact ignore all terms for $k > \alpha^{\epsilon}\mu^{-2}$. The reason
is that

Min(i + rt-/') + ^-£^ = ^l7 + i(5-£:) ]+0(k^^) 




(1 + 0(m))< 



if we choose c small enough. The sum of 0{e-^' '^/loo) for a7//2 < k < oo is then 
0(//~^e~'*^/^°°), which is dominated by the error bounds we will encounter below. 



Ill 



When k < a^ji ^, we have nk'^/2n < < 1/2. Therefore we are justified in 

moving terms out of the exponent in (27.4): 



k>l ' ^ 



2n 6n^ 



+0 - —] ' + O + — + - 

fc>l ■ ^ ^ 

Here u is the "shadow" of n as in (23.4) and (23.10), and the error bounds are computed 
under the assumption k < /jL^. The trick of (23.5) and (23.9) now apphes, using (23.7), 
and we have 

If we had expanded the summand further, we would have obtained still more accuracy; 
therefore we are allowed to set e = in (27.6). The term 0{a~'^n~^/^) dominates 0{a~^) 
when a > n^/^^; it comes from both 0{^'^k'^/n) and 0{k/n) in (27.5). 

We are assuming that /x is smaU, hence a = /x(l + 0(^)). Thus (27.6) can be simplified 

to 

and with an extension of the same approach we obtain an asymptotic expansion that begins 
T./ N 1 2 20 320 7040 ^ / 1 \\ o/s/. ^^ 

This expansion is readily computed if we note that 

, T{z) _ 2"^. 

' (r3^-(i-T(.))--^-' 

where the remaining terms a^i/ (l — T{z)^'^^^^ ^dk^l (l ~" ^(-2))^^ + ' ' ' ^"^^ negligible when 
we replace T{z) by 1 — // — 0(//^). The asymptotic series in (27.7) is obtained also from 
the integral 

/>00 r-OO 

1 g-a^V2W/2-t76^^^ 1 / g(a-t)76-a76^^^ (27.9) 

Jo Jo 
because we can expand e"*^/^ into powers of t and use the formula

j|~e-"W*=?^, (27.10) 

which matches (27.8). The coefficients of (27.7) follow a simple pattern; for example, 
7040 = 22- 16 - 10-4/2. Thus we are led to conjecture the asymptotic series 

/•OO 

/ e('*-*)'/^-'^'/6dt ~ 2F(|,l;;6/a3)/a2 as a^oo; (27.11) 
Jo 

the right-hand side here is a formal power series that diverges for all finite α. And indeed,
this conjecture is true, as we will see momentarily.

A similar calculation allows us to estimate U{m,n), the number of vertices in trees. 
The analog of (27.5) is 

C/(m, n)^^J2^ {il-a)e-('-^^f + ^ + 0(/ca^^-^n-^/^) + Oika^n-')^ ; 

(27.5') 

we leave a factor of k in the O terms because it will lead to a better final estimate. Then 
the analogs of (27.6)-(27.10) are 

C/(m, n) = j^(^l-a + 0{a'^-'n-'/') + 0{a'^-'n-^/') + £ (^^^ ^ ; (27.6') 

, ( ^ 1 11 175 19005 735735 

C/(m,n) =n+ -2a + 7-^ + — ^ + ttt-^ + -r^-pr + 



2^2 8a5 16q!8 128aii 256a 



+ 0(^^))n2/3(l + 0(/x)); (27.7') 



^ / —1^0 — = -y= / 7^ dt + a; 27.9' 

1 r e-^'V2t^-^/2dt= ^'"'^^^^~_y^^ , A;>1. (27.10') 



27r Jo V a 



The asymptotic series (27.7) and (27.7') for $\alpha\to\infty$ blend perfectly with the results
obtained in [28] when α is any constant (positive, negative, or zero):

V{m, n) = ^ (^J^ ^{a-tf/6-aV6 ^2/3 ^ q^^1/S^ . (27.12) 

U{m, n)=n+i-a+ --j= J dt \ n"/"^ + 0(ni/3) . (27.12') 
These integrals are entire functions of α,



JO 3 

I ^372 dt = -e-^/^((6V^r(|)/3) |; |, |; a^/g) 

-(6V^r(i)/6)«F(|, I; |, |;«76) 

+ (6V6r(i)/8)a^F(I,|;|,|;a76)). (27.13') 

Equation (27.13) is proved by observing that if g{a) = e^'^~^^^/^ dt, then 5''(a) = 
joo {a t) g(a-t)3/6 _ e"^l^. It implies (27.11) by weU-known properties of confluent 
hypergeometric series. Equation (27.13') is proved by setting h{oL) — J^ie^'^~^^ — 
ga3/6) ^-3/2 _ _J^°-(tt _t)2g(a-t)3/6^-i/2^^ proving that h'" (a) = la^h'^a) + 

f a/i'(a) + f /i(a), hence [a^+^] h{a) = [a''] h{a){k + ^){k + / {2{k + l){k + 2){k + 3)) . 
Recall that we enumerated n — U{m,n) — V{m,n), the expected number of vertices in 
complex components, using a complementary approach in (15.13), by summing over the 
excess r. 

Lemma 9. Let $V_{mn}$ be the number of vertices in unicyclic components of a random
multigraph with m edges and n vertices. If $m = \frac12 n(1+\mu)$ and $\mu \ge n^{-1/3}$, the expected
value of $V_{mn}^l$ is $O(\mu^{-2l})$, for every fixed integer l ≥ 1.

Proof. Equation (27.7) proves this for l = 1 and $n^{-1/3}\le\mu\le c$, where c is some positive
constant. A similar argument applies for arbitrary l, because the generating function
$\vartheta^l e^{V(z)}$ is $e^{V(z)}/(1-T(z))^{2l}$ times a polynomial in $T(z)$; this means we are summing
terms like (27.5), but with $Q(k)$ replaced by a semipolynomial in k of degree $l-\frac12$. (See
the proof of Theorem 3 in section 8.) The analog of (27.6) will then be $O(\sigma^{-2l})$, which is
$O(\mu^{-2l})$ if μ ≤ c. Incidentally, for this range of μ we will have

EFi.„ = (^^^\l + 0{^^) + 0{^^-'n-')). (27.14) 

If c < < n^, with e<|,let0<5<l — ln(l + c)/c. Then each term in the analog of 
(27.4) with/c < n3/4 is 0{k^-'^ exp{k{\n{l+ij,)- n)+0{ii^P/n))) = 0{k^-^ exp{-k5n)) = 
0(k^-^e-^i^-^^''). Hence EV^^ = 0(6-^"). 

Finally, if > the value of E is superpolynomially small, for it is a sum of 
n terms each of which is bounded by a polynomial in m and n times (1 — l/n)^"^, which 
is 0{iJ,^e~^) for some finite degree d. □ 
Corollary. The probability that a random multigraph never acquires a new complex
component after it has gained $m = \frac12(n+\alpha n^{2/3}) > \frac12 n$ edges is $1 - O(\alpha^{-3})$.

Proof. We may assume that α ≥ 1. A new complex component must be bicyclic. A multigraph
gains a new bicyclic component if and only if the endpoints of a new edge both
fall in unicyclic components. The probability that this occurs at time $m = \frac12 n(1+\mu)$ is
$\mathrm{E}\,V_{mn}^2/n^2 = O(\mu^{-4}n^{-2})$, by the lemma. Summing for $m \ge \frac12(n+\alpha n^{2/3})$ gives $O(\alpha^{-3})$ as
an upper bound on the probability that at least one new bicyclic component appears after
time $\frac12(n+\alpha n^{2/3})$. □

Theorem 15. The probability that an evolving graph or multigraph on n vertices never
has more than one complex component throughout its evolution approaches $\frac{5\pi}{18}$ as $n\to\infty$.

Proof. Let ε > 0 be fixed. By the corollary just proved, there exists a number α,
independent of n, such that the probability of a random multigraph obtaining a new complex
component after time $m = \frac12(n+\alpha n^{2/3})$ is less than ε.

By section 14 and the corollary of section 13, there is a number R, independent of n,
such that the probability of having excess > R at this time m is less than ε. So the
probability that a random multigraph leaves the top line after excess R is < 2ε. (Either it
reaches excess R before time m, or it leaves the top line after time m.)

But the probability that a random multigraph leaves the top line before excess R is
$1 - \frac{5\pi}{18} + O(R^{-1}) + O(n^{-1/3})$, by (18.2). We may choose R sufficiently large that this
$O(R^{-1})$ is less than ε; then we may choose n sufficiently large that the $O(n^{-1/3})$ is less
than ε. The probability that a random multigraph leaves the top line for such n is therefore
between $1 - \frac{5\pi}{18} - 2\epsilon$ and $1 - \frac{5\pi}{18} + 4\epsilon$.

For graphs, we note that an evolving graph may be constructed from an evolving
multigraph by ignoring all new edges that would be loops or parallel to an existing edge.
Since this reduction preserves or decreases both the excess and the number of complex
components, it follows that if the graph leaves the top line after excess R, then the
multigraph does too. Hence this event likewise has probability < 2ε, and the proof is completed
as for multigraphs. □

Theorem 16. Given any set S of infinite paths in Figure 1, the probability that the
evolution of a random multigraph follows a path in S converges as $n\to\infty$ to the corresponding
probability for the Markov chain with the transition probabilities given in Theorem 9.
Similarly, if the evolution of a random graph, which stops at excess $\binom{n}{2} - n$ when the
complete graph is reached, is continued along the top line to an infinite path in Figure 1,
then the probability that this path lies in S converges to the same limit.

Proof. Given ε > 0, let R be as in the preceding proof, so that a random graph or multigraph
leaves the top line after excess R with probability < 2ε. We can also choose R large enough
that cu > eij(l — e), by (8.7). Since cu/eji is the sum of all Markov transition probabilities
for paths that intersect the top line at excess R, if we cut Figure 1 at excess R, the Markov 
probabilities for paths in S that do not have this property must sum to less than e. When 
R is large enough, the sum of Markov probabilities for all paths that diverge from the top 
line after excess R is likewise less than ε, because it is $O(\sum_{r>R} r^{-2}) = O(R^{-1})$.

Let $P_n(S)$ be the probability that the evolution of a random graph or multigraph on n
vertices follows a path in S, and let $P_\infty(S)$ denote the corresponding Markov probability.
If $S_R$ is the subset of S having all paths on the top line when the excess is > R, then
$0 \le P_n(S) - P_n(S_R) < 2\epsilon$ for all n ≤ ∞. Similarly, if $S'_R$ is the set of all paths that follow a
path in $S_R$ up to excess R, but afterwards are arbitrary, then $0 \le P_n(S'_R) - P_n(S_R) < 2\epsilon$,
for n ≤ ∞. Finally, by Theorem 10, $|P_n(S'_R) - P_\infty(S'_R)| < \epsilon$ if n is large enough, and we
have $|P_n(S) - P_\infty(S)| < 5\epsilon$. □

Theorem 16 says that the evolutionary path, regarded as a random element of the 
set of all paths in Figure 1, converges in distribution to the Markov process. There are 
uncountably many paths, but the theorem needs no measurability restriction since the 
distributions for finite n and for the limit are concentrated on the countable set of paths 
that eventually follow the top line. Note that we cannot strengthen the statement for 
random graphs to deduce the limiting probability that the evolution follows a path in S 
until it stops at excess $\binom{n}{2} - n$; for example, if S is the set of all paths that do not eventually
follow the top line, the Markov probability $P_\infty(S)$ is zero, while $P_n(S) = 1$ for all finite n.

Corollary. The probability that an evolving graph or multigraph never has more than l
complex components converges to a limit $P_l$. □

Closed form expressions for $P_l$ might not exist when l ≥ 2, but the values can be
estimated from below using the following related probabilities:

Corollary. The probability that an evolving graph or multigraph acquires exactly l ≥ 1
new complex components during the evolution converges to

$$p'_l = \Pr\Bigl(\sum_{r=0}^{\infty} I_r = l\Bigr) = \Pr\Bigl(\sum_{r=1}^{\infty} I_r = l-1\Bigr) , \tag{27.15}$$

where $I_0, I_1, I_2, I_3, \ldots$ are independent Bernoulli distributed random variables with
$\Pr(I_r = 1) = 1 - \Pr(I_r = 0) = 5/\bigl((6r+1)(6r+5)\bigr)$.

In other words, the number of new complex components converges in distribution to
$\sum_{r=0}^{\infty} I_r$.

Proof. Let $I_r = 1$ if the Markov process acquires a new bicyclic component when the
excess goes from r to r + 1, and $I_r = 0$ otherwise; in particular $I_0 = 1$ always. By
Theorem 9, $\Pr(I_r = 1) = 5/\bigl((6r+1)(6r+5)\bigr)$ independently of the previous history, and
thus the variables are independent. □

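The probabilities $p'_l$ have a surprisingly simple generating function: We have

$$p'_l = [z^l] \prod_{r=0}^{\infty}\Bigl(1 + \frac{5(z-1)}{(6r+1)(6r+5)}\Bigr)
      = [z^l]\, \frac{\Gamma(\frac16)\,\Gamma(\frac56)}{\Gamma\bigl(\frac12+\frac16\sqrt{9-5z}\bigr)\,\Gamma\bigl(\frac12-\frac16\sqrt{9-5z}\bigr)}
      = [z^l]\, \cos\bigl(\tfrac{\pi}{6}\sqrt{9-5z}\bigr) \Big/ \cos\tfrac{\pi}{3} . \tag{27.16}$$

Computing the coefficients of the Taylor series for $\cos\bigl(\frac{\pi}{6}\sqrt{9-5z}\bigr)$, we find that the numbers
are rational polynomials in π:

$$p'_1 = \frac{5\pi}{18} \approx 0.87266 ; \qquad p'_3 = \frac{5^3\pi\,(12-\pi^2)}{2^6\,3^7} \approx 0.00598 ;$$
$$p'_2 = \frac{25\pi}{648} \approx 0.12120 ; \qquad p'_4 = \frac{5^4\pi\,(10-\pi^2)}{2^8\,3^8} \approx 0.00015 .$$

Let $P'_l = \sum_{k=1}^{l} p'_k$; numerically we have $P_2 \ge P'_2 \approx 0.99387$, $P_3 \ge P'_3 \approx 0.99985$,
$P_4 \ge P'_4 \approx 0.999998$.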



The number of new complex components is also studied in [19], where further results 
are given. The methods of [19] do not, however, seem to yield the sharp results obtainable 
with generating functions. 



28. Empirical data. Computer simulations of random multigraphs tend to confirm the 
theoretical results derived above, although there are a few surprises apparently due to the 
slow convergence of some asymptotic formulas. In this section we will discuss some of 
the statistics computed during 1000 trials of the multigraph process on 20,000 vertices, so 
that readers can obtain a feel for the way in which random multigraphs actually evolve
in practice. The data was divided into two groups of 500 runs each, and both groups
exhibited essentially the same behavior; therefore the full set of 1000 runs is being treated 
as a unit here. 

When a statistic is given in the form '$x \pm y$' below, $x$ is the sample mean and $y$ is the
sample standard deviation divided by $\sqrt{1000}$. The sample standard deviation has been
computed by taking the square root of an unbiased estimate of the variance. The "time" 
of an event is the number of edges present when that event occurred. 

The first cycle was formed at time 6769 ± 96; this agrees reasonably well with the
asymptotic formula $n/3$ found in [14, Corollary 3]. The size of the first unicyclic component
was 188 ± 14. According to (26.14), the mean should be approximately 177.
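A small experiment in the same spirit can be sketched as follows. The code runs the multigraph process (both endpoints of each new edge chosen independently and uniformly, so self-loops and repeated edges are allowed) with a union-find structure, records the time of the first cycle, and reports it in the '$x \pm y$' form described above. The function names, the seed, and the reduced sizes `n = 2000` and `trials = 300` are illustrative assumptions chosen to keep the run short; this is not the program that produced the data in this section.

```python
import random
import statistics

def first_cycle_time(n, rng):
    """Number of edges present when the first cycle appears in the multigraph process."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    edges = 0
    while True:
        x, y = rng.randrange(n), rng.randrange(n)
        edges += 1
        rx, ry = find(x), find(y)
        if rx == ry:            # self-loop, repeated edge, or a longer cycle
            return edges
        parent[rx] = ry         # the edge joins two different tree components

def mean_and_standard_error(samples):
    """Sample mean, and unbiased sample standard deviation divided by sqrt(len)."""
    return statistics.mean(samples), statistics.stdev(samples) / len(samples) ** 0.5

rng = random.Random(12345)
n, trials = 2000, 300
times = [first_cycle_time(n, rng) for _ in range(trials)]
mean, err = mean_and_standard_error(times)
print(f"first cycle at time {mean:.0f} ± {err:.0f}   (n/3 = {n / 3:.0f})")
```

A forest on $n$ vertices has at most $n - 1$ edges, so the loop always stops within $n$ steps.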

The length of the first cycle was 3.9 ± 0.1; in fact, the histogram was

   length      =   1   2   3   4   5   6   7  ≥8
   actual      = 321 132  89  88  78  86  60 146
   theoretical = 333 133  76  51  37  28  23 318

The limiting distribution has infinite mean; for finite $n$ the mean is approximately $2.03\,n^{1/6} + O(n^{3/22})$, and the standard
deviation is of order $n^{1/4}$ by (26.13), so the length of the first cycle should not be expected
to be a robust statistic. However, the marked deviation in the histogram for cycle lengths $\ge 4$
was unexpected. Apparently $n$ must become quite large before the asymptotic probability
of first cycle length $k$ will assert itself.

Several people have suggested in conversation that the "last cycle" ought to have the 
same statistical characteristics as the first. The last cycle is the last unicyclic component 
that is present during a multigraph's evolution: After it is absorbed into a component 
of higher complexity, no further unicycles exist, and no further unicycles are formed. (If 
two cycles disappear simultaneously when the edge $\{x, y\}$ is added, we say that the cycle
containing $y$ was the last to go.) The manner in which the giant component swallows other
structures is rather like the initial stages of evolution but in reverse: First the unicycles
tend to go, then the larger trees, and finally only isolated vertices are left (see Bollobás [6,
sections VI.3 and VII.1]). A strong formulation of this symmetry principle was proved by
Łuczak [25]; the phenomenon can be explained by the symmetry between $T(z)$ and $2 - T(z)$
in $U(z)$. However, the length of the last cycle has a distinctly different distribution from the
length of the first cycle (see [20]). In these computer runs it had the following histogram: 

   length   =   1   2   3   4   5   6   7  ≥8
   observed = 423 144 107  79  63  62  40  82

with mean 3.1 ± 0.1.

The total number of unicyclic components formed during the entire evolution was

   number   =   1   2   3   4   5   6   7  ≥8
   observed =  53 148 221 219 178  98  44  39

with mean 4.0 ± 0.1.

The excess of the multigraph changed from 0 to 1 at time 10331 ± 13. The number
of unicyclic components present was about 2.7 just before this event, and about 1.5 just
after. As soon as the excess became positive it began a steady rise:

                         unicyclic size   unicyclic size   complex size   complex size
excess       time          just before      just after     just before     just after
   1     10331 ± 13         1606 ± 22         163 ± 9            0          1442 ± 21
   2     10501 ± 10          265 ± 14         132 ± 7       1779 ± 22       1912 ± 20
   3     10603 ±  8          168 ±  9         111 ± 7       2166 ± 19       2222 ± 19
   4     10675 ±  8          132 ±  8          90 ± 5       2433 ± 18       2475 ± 17
   5     10738 ±  8          105 ±  6          85 ± 5       2659 ± 17       2680 ± 17
   6     10789 ±  7           95 ±  6          76 ± 5       2825 ± 17       2844 ± 16
   7     10835 ±  7           83 ±  5          69 ± 4       2980 ± 16       2994 ± 16
   8     10880 ±  7           77 ±  5          66 ± 4       3126 ± 16       3137 ± 16
   9     10920 ±  7           72 ±  5          62 ± 4       3253 ± 15       3263 ± 15
  10     10955 ±  7           66 ±  4          58 ± 4       3371 ± 15       3379 ± 15



The value of $n^{2/3}$ is approximately 737 when $n = 20000$, so each additional edge increases
the parameter $\mu$ of Lemma 3 by approximately 0.0027. The value of $\mu$ when $m = 10955$
is approximately 2.59; then $\frac23\mu^3 + 1 + O(\mu^{-3}) \approx 12.6$, so the excess is not quite
keeping up with the expected value in Theorem 6. Similarly, formula (26.16) predicts that
the excess will reach 1 when $m \approx 10299$, and 10 when $m \approx 10888$; random multigraphs for
finite $n$ seem to become complex a bit "late." It is interesting to note that the observed
standard deviations kept decreasing as the excess increased, while the discrepancy from
(26.16) kept increasing.

The random multigraphs followed paths in Figure 1 with the frequencies shown in
Figure 3. When the excess changed from 9 to 10, the transition was from a single $C_9$
to $C_{10}$ in 977 cases, from $C_9$ to $(C_1, C_9)$ in 2 cases, from $(C_1, C_8)$ to $C_{10}$ in 8 cases, and
from $(C_1, C_8)$ to $(C_1, C_9)$ in the remaining 13 cases. Altogether 897 of the 1000 random
multigraphs remained on the top line of Figure 1 throughout their evolution.
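The parameter values quoted above for $n = 20000$ can be reproduced with a few lines of arithmetic. The sketch assumes the scaling $m = \frac12 n(1 + \mu n^{-1/3})$, which is inferred from the quoted figures (one extra edge changes $\mu$ by $2/n^{2/3}$) rather than copied from Lemma 3, and it keeps only the leading terms $\frac23\mu^3 + 1$ of the predicted excess; the helper names are ours.

```python
n = 20000
scale = n ** (2 / 3)                 # n^(2/3), about 737

def mu_of(m):
    """Parameter mu for m edges, assuming m = n/2 * (1 + mu * n^(-1/3))."""
    return (2 * m - n) / scale

mu_per_edge = 2 / scale              # change in mu caused by one extra edge
mu_at_excess_10 = mu_of(10955)       # observed mean time at which the excess reached 10
predicted_excess = (2 / 3) * mu_at_excess_10 ** 3 + 1   # leading terms only

print(f"n^(2/3) = {scale:.1f}")
print(f"mu per edge = {mu_per_edge:.4f}")
print(f"mu(10955) = {mu_at_excess_10:.2f}")
print(f"2/3 mu^3 + 1 = {predicted_excess:.1f}")
```

The predicted value of about 12.6 is what the observed excess of 10 at time 10955 is being compared with.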




Figure 3. The number of times the paths in Figure 1 were actually 
traced, when 1000 random multigraphs on 20000 vertices were generated 
in experimental tests. 

There comes a time when the giant component first succeeds in annihilating everything 
except isolated vertices, after which it remains the only component with edges. In these 
runs that time was 58352 ± 224. The number of isolated vertices still remaining was then 
71 ±1. 

The multigraph finally became connected at time 105294 ± 404. The expected time for
an evolving multigraph to have no isolated vertices is $\frac12 nH_n = \frac12 n\ln n + \frac12\gamma n + \frac14 + O(n^{-1})$,
which is approximately 104807 when $n = 20000$.
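The figure 104807 is easy to confirm. The sketch below evaluates $\frac12 nH_n$ directly and compares it with the three-term expansion; the function names are ours, and the numerical value of Euler's constant $\gamma$ is typed in by hand.

```python
import math

EULER_GAMMA = 0.5772156649015329

def half_n_harmonic(n):
    """Exact value of (1/2) * n * H_n, summed with math.fsum for accuracy."""
    return 0.5 * n * math.fsum(1.0 / k for k in range(1, n + 1))

def half_n_harmonic_asymptotic(n):
    """(1/2) n ln n + (1/2) gamma n + 1/4, omitting the O(1/n) term."""
    return 0.5 * n * math.log(n) + 0.5 * EULER_GAMMA * n + 0.25

n = 20000
exact = half_n_harmonic(n)
approx = half_n_harmonic_asymptotic(n)
print(f"exact {exact:.2f}, expansion {approx:.2f}")
```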

29. Open problems. The topics discussed in this paper raise a host of interesting questions,
and the answers to those questions will no doubt bring additional striking patterns
to light. 

But the reader may have noticed that this paper is already rather long. Therefore 
it seems wise to stop at this point, with the hope that researchers all over the world will 
enjoy exploring the tantalizing questions that remain. 

For example, it would be interesting to find a basis for as many linear combinations
of terms $w^m T^a/(1-T)^b$ as possible such that

$$\Phi_n\, w^m\, T^a/(1-T)^b$$

has a known value, as in (25.22). We can find many linear combinations of such functions
for which $\Phi_n$ gives 0, because $\Phi_n \nabla S$ is usually 0 or 1. Notice that

$$\frac{T^a}{(1-T)^{b+1}} = \frac{T^a}{(1-T)^b} + \frac{T^{a+1}}{(1-T)^{b+1}}; \tag{29.1}$$

hence terms of excess $r + 1$ can be expressed as combinations of terms of excess $r$. Conversely,
we can go from excess $r$ to excess $r + 1$, because

$$\frac{T^a}{(1-T)^b} = \frac{T^a}{(1-T)^{b+1}} - \frac{T^{a+1}}{(1-T)^{b+2}} + \frac{T^{a+2}}{(1-T)^{b+3}} - \cdots \tag{29.2}$$

is an infinite series that always "converges" under application of $\Phi_n$; all terms after a
certain point are multiples of $T^{n+1}$, so they do not change the coefficient of $z^n$.
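Both identities are elementary, and they can be checked with exact rational arithmetic. The sketch below verifies $T^a/(1-T)^{b+1} = T^a/(1-T)^b + T^{a+1}/(1-T)^{b+1}$ and the finite form of the alternating expansion: after $K$ terms the remainder is exactly $(-1)^K T^{a+K}/(1-T)^{b+K}$, a multiple of $T^{a+K}$, which is why the tail cannot affect the coefficient of $z^n$ once $a + K > n$. The function names and sample values are illustrative.

```python
from fractions import Fraction

def term(T, a, b):
    """The basic function T^a / (1 - T)^b evaluated at a rational T."""
    return T ** a / (1 - T) ** b

def splitting_identity_holds(T, a, b):
    """T^a/(1-T)^(b+1) == T^a/(1-T)^b + T^(a+1)/(1-T)^(b+1)."""
    return term(T, a, b + 1) == term(T, a, b) + term(T, a + 1, b + 1)

def partial_sum(T, a, b, K):
    """First K terms of the alternating expansion of T^a/(1-T)^b."""
    return sum((-1) ** k * term(T, a + k, b + k + 1) for k in range(K))

def remainder(T, a, b, K):
    """Exact remainder after K terms; note the factor T^(a+K)."""
    return (-1) ** K * term(T, a + K, b + K)

T = Fraction(2, 7)
checks = [
    term(T, a, b) == partial_sum(T, a, b, K) + remainder(T, a, b, K)
    for a in range(0, 4) for b in range(1, 5) for K in range(0, 6)
]
print(all(checks), splitting_identity_holds(T, 3, 5))
```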

The stopping configuration machinery suggests many further problems of interest. For
example, we should be able to deduce more about the nature of a random multigraph when
its deficiency first exceeds a given number $d$.

The discussion in section 23 characterizes the stochastic behavior of $r$ and $d$ when
$\mu = o(1)$; what happens thereafter? Relations (23.12) and (23.13) may well continue to
describe the approximate mean values of $r$ and $d$ as $\mu \to \infty$. The shadow point $\sigma$ defined
in (23.2) will approach 0, but it remains an analytic function of n, and $1 - \sigma$ remains a
saddle point of the contour integral for $[z^n]\, U^{n-m+r}\, T^{2r-d}/(1-T)^{3r-d+1/2}$.

The analytic function $T(z)$ has an interesting Riemann surface: There is a quadratic
singularity at $z = e^{-1}$, and if we travel around that point we get to a second sheet in which
there is a logarithmic singularity at $z = 0$. Winding around that logarithmic singularity
takes us to infinitely many other sheets having no finite singularities besides 0. It may be
possible to work out a theory under which contour integrals of importance in the study of
random graphs could be evaluated by paths that pass through the point $1 + \mu$, which lies
on the "wrong side" of the quadratic singularity of $T(z)$; $1 + \mu$ turns out to be a saddle
point for several important generating functions.

Identity (8.15)-(8.16) suggests that the generating functions for random multigraphs
might have interesting continued fraction forms. Such expressions could well be of special
importance, because they often converge when power series do not.

The fact that the recurrence for the coefficients Crd can be "solved" to yield (7.3)-(7.5) 
should prove to be a good challenge for computer systems that are now being constructed 
to solve recurrence relations automatically. The similar recurrence for the coefficients c^^, 
discussed in (7.24) and (7.25), will probably be an even greater challenge; at least, no 
simple derivation of (7.21) from (7.26) is known. 

The solution to the recurrence for $e_{rd}$ in section 7 relies on the introduction of a "half
excess" stage, in which the polynomials must be evaluated at integers plus $\frac12$ although
the recurrence in which they are used involves integers only. In section 20 we found,
similarly, that it was fruitful to break the process of adding an edge into stages in which
"half-edges" were added. Perhaps the theory of fractional differentiation will be of value
in future investigations. However, the operators $D^{1/2}$ and $\vartheta^{1/2}$ do not seem to transform
the basic functions $T^a/(1-T)^b$ very nicely.

Is there an equation (27.11') analogous to (27.11)? There must be a reason why the 
coefficients of (27.7') tend to have small prime factors. 

We have seen numerous examples in which the multigraph process leads to formulas 
that are mathematically cleaner than the analogous formulas for the graph process. This 
suggests that an analogous theory be introduced in place of the alternative "$G_{n,p}$" model
of random graphs: Instead of saying that each edge is present with probability p, the 
multiplicity of each edge should be allowed to have a Poisson distribution with mean p. 
Readers are encouraged to experiment with such an approach. 

Convergence to limiting distributions often appears to be monotonic. For example, 
the probability that an evolving multigraph on n vertices stays on the top line appears to 
be strictly decreasing as n increases. How could this be proved? 

Our proof of the top-line probability in Theorem 15 was independent of the difficult 
analyses in Lemma 7 and Theorem 13 about the behavior of random multigraphs with 
more than ^{n + n^/^"*"^) edges; moreover, it did not use the stopping-configuration ma- 
chinery of sections 24-26, although that theory was in fact motivated by attempts to prove 
Theorem 15 in a sharper form via generating functions. The top-line phenomenon may 
perhaps be understood more deeply if we use a generating-function-based approach, and
the following ideas may therefore prove to be useful. Let $S(w, z)$ be the bgf for all multigraphs
that never leave the top line of Figure 1, where each multigraph is weighted by the
probability of having a purely top-line history as discussed in section 17. The discussion
of sections 19 and 20 shows that

$$S(w, z) = e^{U(w,z) + V(w,z)}\, H(w, z), \tag{29.3}$$

where $H(w,z)$ satisfies a differential equation almost like the equation (5.1) that defines
$E(w,z)$:

$$\frac{1}{w}(\vartheta_w - T\vartheta_z)H = \tfrac12\, e^{-V}\vartheta^2 e^{V}H - \tfrac12\, e^{-V}(\vartheta^2 e^{V})(H - 1). \tag{29.4}$$

The subtracted term $\frac12 e^{-V}(\vartheta^2 e^{V})(H - 1)$ accounts for the forbidden case that a new
edge marked by $\vartheta^2$ lies entirely in the unicyclic part generated by $e^V$; a second complex
component arises if and only if this happens. The correction applies to $H - 1$, not $H$,
because the very first complex component does not violate the top-line condition.

Expressing $H(w, z)$ in the form (25.16), we have $H_1 = E_1$, but $H_2$ is smaller than $E_2$:

$$H_2 = \frac{5}{16}\,\frac{T^4}{(1-T)^6} + \frac{25}{48}\,\frac{T^3}{(1-T)^5} + \frac{11}{48}\,\frac{T^2}{(1-T)^4} + \frac{1}{48}\,\frac{T}{(1-T)^3}.$$

In general we can write

$$H_r = \sum_{d} h_{rd}\, \frac{T^{2r-d}}{(1-T)^{3r-d}} \tag{29.5}$$

for appropriate coefficients $h_{rd}$. The special case of (20.7) in which both parameters are zero tells us that

$$\vartheta^2 e^V = \vartheta^2 (1-T)^{-1/2} = \tfrac12\, T(1-T)^{-7/2} + \tfrac54\, T^2(1-T)^{-9/2}; \tag{29.6}$$

therefore we can compute the coefficients $h_{rd}$ by making a slight change to the rule for
computing $e_{rd}$ that is expressed in (20.11): Subtract 5 from the numerator of the first
coefficient term in (20.11), and subtract 1 from the numerator of the second coefficient.
The first coefficient now simplifies to

$$\frac{(6r - 2d + 5)(6r - 2d + 1) - 5}{8(3r - d + 3)} = \frac{3r - d}{2}.$$

In particular, when $d = 0$ we have $h_{(r+1)0} = \frac32\, r\, h_{r0}$; hence $h_{r0}$ is the number we called $k_r$
in (24.3).
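The coefficients of $H_2$ can be checked mechanically. Substituting $H = \sum_r w^r H_r(wz)$ into the differential equation for $H$ and comparing powers of $w$ gives, for $r = 2$, the relation $2H_2 + (1-T)\vartheta H_2 = (\vartheta V)(\vartheta H_1) + \frac12\vartheta^2 H_1$ with $\vartheta V = \frac12 T/(1-T)^2$. This reduction is our own working and assumes that the equation carries a factor $1/w$ on its left side; the value $H_1 = E_1 = \frac{5}{24}T^2/(1-T)^3 + \frac18 T/(1-T)^2$ is likewise an assumption as far as this section is concerned. The sketch represents each function as a dictionary of terms $c\,T^p/(1-T)^q$, uses $\vartheta T = T/(1-T)$, and compares both sides exactly at enough rational points to force equality of the rational functions. It also checks the simplification of the first coefficient.

```python
from fractions import Fraction as F

# A function of T is stored as {(p, q): c}, meaning the sum of c * T^p / (1 - T)^q.
H1 = {(2, 3): F(5, 24), (1, 2): F(1, 8)}
H2 = {(4, 6): F(5, 16), (3, 5): F(25, 48), (2, 4): F(11, 48), (1, 3): F(1, 48)}

def scaled(f, c=1, dp=0, dq=0):
    """Multiply f by c * T^dp / (1 - T)^dq."""
    return {(p + dp, q + dq): c * v for (p, q), v in f.items()}

def added(*fs):
    """Sum of several functions, merging equal (p, q) terms."""
    out = {}
    for f in fs:
        for key, v in f.items():
            out[key] = out.get(key, 0) + v
    return out

def theta(f):
    """theta(T^p/(1-T)^q) = p T^p/(1-T)^(q+1) + q T^(p+1)/(1-T)^(q+2)."""
    return added(
        {(p, q + 1): p * v for (p, q), v in f.items()},
        {(p + 1, q + 2): q * v for (p, q), v in f.items()},
    )

def value(f, T):
    return sum(v * T ** p / (1 - T) ** q for (p, q), v in f.items())

# Left side: 2 H2 + (1 - T) theta H2.  Right side: (theta V)(theta H1) + theta^2 H1 / 2.
lhs = added(scaled(H2, 2), scaled(theta(H2), dq=-1))
rhs = added(scaled(theta(H1), F(1, 2), dp=1, dq=2), scaled(theta(theta(H1)), F(1, 2)))
points = [F(k, 17) for k in range(1, 13)]
identity_holds = all(value(lhs, T) == value(rhs, T) for T in points)

# The simplified "first coefficient" of the modified rule.
simplifies = all(
    F((6 * r - 2 * d + 5) * (6 * r - 2 * d + 1) - 5, 8 * (3 * r - d + 3)) == F(3 * r - d, 2)
    for r in range(1, 8) for d in range(0, 2 * r)
)
print(identity_holds, simplifies)
```

The leading coefficients also satisfy $h_{20} = \frac32 h_{10}$, that is, $\frac{5}{16} = \frac32\cdot\frac{5}{24}$.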

Equation (25.17) now gives us a useful expression for the stopping configurations, 
V5 = e^(^'^) ^t(;'"(T?^e^)iy^_i(t(;^) 

r>2 

V (l-nwz)f^ (1-T{wz)f'^ ) 

The probability that an evolving multigraph on $n$ vertices leaves the top line of Figure 1
is $\Phi_n \nabla S$.

For fixed $r$ we can evaluate the contribution made to $\Phi_n \nabla S$ by the $r$th term of
(29.7), to within $O(n^{-1/3})$, because the leading coefficient $h_{(r-1)0}$ controls the asymptotic
behavior. Indeed, we know from (25.22) and the subsequent discussion that



wz 



,2r 



-^kui = 7^ + 0{n-'") (29.8) 

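for all fixed $r$. Therefore when $\Phi_n$ is applied to the $r$th term of (29.7) we get

$$\frac{5k_{r-1}}{24\,r\,e_r} + O(n^{-1/3}). \tag{29.9}$$

When $r = 2$, the limit is $\frac{5}{77}$; when $r > 2$, (7.1) and (24.3) imply that

$$\frac{5k_{r-1}}{24\,r\,e_r} = \biggl(\frac{5k_{r-2}}{24(r-1)e_{r-1}}\biggr) \biggl(\frac{36(r-1)(r-2)}{(6r-1)(6r-5)}\biggr).$$

It follows by induction that

$$\frac{5k_{r-1}}{24\,r\,e_r} = \prod_{k=1}^{r-2} \frac{k(k+1)}{(k+\frac16)(k+\frac56)} - \prod_{k=1}^{r-1} \frac{k(k+1)}{(k+\frac16)(k+\frac56)}.$$

So the sum over $r$ is a telescoping series,

$$\sum_{r\ge2} \frac{5k_{r-1}}{24\,r\,e_r} = 1 - \prod_{k=1}^{\infty} \frac{k(k+1)}{(k+\frac16)(k+\frac56)} = 1 - \frac{5\pi}{18}. \tag{29.10}$$

In other words, convergence to the top-line probability depends entirely on the sum over $r$
of the error term in (29.9).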
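The telescoping argument can be checked exactly for small $r$ and numerically in the limit. The sketch below assumes the recurrences $k_1 = \frac{5}{24}$, $k_{j+1} = \frac32 j\,k_j$ and $e_0 = 1$, $e_j = \frac{(6j-1)(6j-5)}{24j}\,e_{j-1}$; these are our reading of (24.3) and (7.1), which lie outside this section, and the function names are illustrative.

```python
from fractions import Fraction
import math

def k(r):
    """k_1 = 5/24 and k_{j+1} = (3j/2) k_j (assumed form of (24.3))."""
    value = Fraction(5, 24)
    for j in range(1, r):
        value *= Fraction(3 * j, 2)
    return value

def e(r):
    """e_0 = 1 and e_j = (6j-1)(6j-5)/(24j) e_{j-1} (assumed form of (7.1))."""
    value = Fraction(1)
    for j in range(1, r + 1):
        value *= Fraction((6 * j - 1) * (6 * j - 5), 24 * j)
    return value

def limit_term(r):
    """Limiting contribution 5 k_{r-1} / (24 r e_r) of the r-th term."""
    return 5 * k(r - 1) / (24 * r * e(r))

def partial_product(m):
    """Exact product of k(k+1)/((k+1/6)(k+5/6)) for k = 1..m."""
    value = Fraction(1)
    for j in range(1, m + 1):
        value *= Fraction(j * (j + 1)) / ((j + Fraction(1, 6)) * (j + Fraction(5, 6)))
    return value

def partial_product_float(m):
    value = 1.0
    for j in range(1, m + 1):
        value *= j * (j + 1) / ((j + 1 / 6) * (j + 5 / 6))
    return value

telescopes = all(
    limit_term(r) == partial_product(r - 2) - partial_product(r - 1) for r in range(2, 11)
)
leave_probability = 1 - partial_product_float(200_000)
print(limit_term(2), telescopes, leave_probability, 1 - 5 * math.pi / 18)
```

The printed float approaches $1 - 5\pi/18 \approx 0.1273$, the limiting probability of leaving the top line.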

The number of challenging and potentially fruitful questions that remain unanswered 
seems to be almost endless. But we shall close this list of research problems by stating what 
seems to be the single most important related area ripe for investigation at the present 
time. Wright [42] gave a procedure for computing the number of strongly connected labeled 
digraphs of excess r, analogous to his formulas for connected labeled undirected graphs. 
Random directed multigraphs are of great importance in computer applications, and it is 
shocking that so little attention has been given to their study so far. Karp [21] carried 
Wright's investigations further and discovered a beautiful theorem: A random digraph
with $n(1 + \mu)$ directed arcs almost surely has a giant strong component of size $\sim \theta(\mu)^2 n$,
where $\theta(\mu)$ is the factor such that an undirected graph with $\frac12 n(1 + \mu)$ edges almost surely
has a giant component of size $\sim \theta(\mu) n$. (The function $\theta(\mu)$ is $(\mu + \sigma)/(1 + \mu)$, according to
(23.11). Karp's investigation was based on $D_{n,p}$, in which every directed arc is present with
probability $p$, but a similar result surely holds for other models of random digraphs.) A
complete analysis of the random directed multigraph process is clearly called for, preferably 
based on generating functions so that extensive quantitative information can be derived 
without difficulty. 

Here is a sketch of how such an investigation might begin. The directed multigraph
process consists of adding directed arcs $x \to y$ repeatedly to an initially empty multiset
of arcs on the vertices $\{1, 2, \ldots, n\}$, where $x$ and $y$ are independently and uniformly
distributed between 1 and $n$. The compensation factor $\kappa(M)$ of a multidigraph $M$ with $m_{xy}$
arcs from $x$ to $y$ is $1/\prod_{x=1}^{n}\prod_{y=1}^{n} m_{xy}!$; we compute bivariate generating
functions as in (2.1). The bgf for all possible multidigraphs is $\sum_{n\ge0} e^{wn^2} z^n/n! = G(2w, z)$.
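The claim about the bgf of all multidigraphs amounts to the statement that, for $n$ labelled vertices and $m$ arcs, the compensation factors $\kappa(M)$ add up to $(n^2)^m/m!$, the coefficient of $w^m$ in $e^{wn^2}$. The brute-force sketch below confirms this for a few small cases by enumerating all arc-multiplicity matrices; it assumes the convention that a multidigraph with $m$ arcs on $n$ vertices contributes $\kappa(M)\,w^m z^n/n!$, and the function name is ours.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def total_compensation(n, m):
    """Sum of kappa(M) over all multidigraphs on n labelled vertices with m arcs."""
    total = Fraction(0)
    # One multiplicity m_xy for each of the n*n ordered pairs (x, y), self-loops included.
    for multiplicities in product(range(m + 1), repeat=n * n):
        if sum(multiplicities) != m:
            continue
        kappa = Fraction(1)
        for count in multiplicities:
            kappa /= factorial(count)
        total += kappa
    return total

for n, m in [(1, 3), (2, 2), (2, 4)]:
    print(n, m, total_compensation(n, m), Fraction(n ** (2 * m), factorial(m)))
```

The agreement is an instance of the multinomial theorem applied to $n^2$ ordered pairs.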

Let $\mathcal{A}$ be the family of all multidigraphs such that all vertices are reachable from
vertex 1 via a directed path, and let $A(w, z)$ be the corresponding bgf. There is a nice
relation between $A(w, z)$ and the bgf $C(w, z)$ for connected undirected multigraphs, (2.10):
If $A(w,z) = \sum_{n\ge1} a_n(w) z^n/n!$, we have

$$\sum_{n\ge1} a_n(w)\, e^{-wn^2/2}\, \frac{z^n}{n!} = C(w, z). \tag{29.11}$$

This can be proved by replacing $z$ by $ze^{-w/2}$ and noting that $C(w, ze^{-w/2})$ is the bgf
for connected multigraphs without self-loops, and by showing that all members of $\mathcal{A}$ are
obtainable from such connected multigraphs $M$ by the following reversible construction:
Define a linear ordering $\prec$ on the vertices $\{1, 2, \ldots, n\}$ by saying that $x \prec y$ if $d(x) < d(y)$
or $d(x) = d(y)$ and $x < y$, where $d(x)$ is the distance from 1 to $x$ in $M$. Then define a
multidigraph $D \in \mathcal{A}$ by arcs $x \to y$ whenever $x$ and $y$ are adjacent in $M$ and $x \prec y$; include arbitrary
additional arcs $x \to y$ for all pairs of vertices with $x \succeq y$. The construction is reversible
because $d(x)$ is easily seen to be the distance from 1 to $x$ in $D$, regardless of the choice
of additional arcs. The additional arcs correspond to a multiplicative factor $e^{w\binom{n+1}{2}} =
e^{wn^2/2}(e^{w/2})^n$ for an $n$-vertex multigraph, with one factor $e^w$ for each of the $\binom{n+1}{2}$ vertex
pairs $x \succeq y$.

Let $\mathcal{S}$ be the family of all strongly connected multidigraphs, and let $S(w, z) = s_1(w)z +
s_2(w)z^2/2! + s_3(w)z^3/3! + \cdots$ be the corresponding bgf. A nontrivial identity discovered by
Wright [40] implies that we can calculate the coefficients $s_n(w)$ by using the formula

where the prime in $C'(w,z)$ denotes differentiation with respect to $z$. Notice that our
generating function $G(w, z)$ satisfies

$$G'(w, z) = e^{w/2}\, G(w, ze^{w}), \qquad G''(w, z) = e^{2w}\, G(w, ze^{2w}),$$
$$G^{(n)}(w, z) = e^{n^2 w/2}\, G(w, ze^{nw}); \tag{29.13}$$

thus the denominator $G(w, ze^{-nw})$ in (29.12) is essentially an $n$-fold integral of $G(w, z)$.
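The derivative identities for $G$ follow from $(n+k)^2 = n^2 + 2nk + k^2$ applied term by term to $G(w, z) = \sum_{n\ge0} e^{wn^2/2} z^n/n!$. The sketch below checks $G^{(k)}(w, z) = e^{k^2 w/2}\, G(w, ze^{kw})$ numerically. The series is only formal for $w > 0$, so the check uses a negative $w$, for which it converges; that choice, the truncation at 80 terms, and the function names are assumptions made for the sake of a runnable example.

```python
import math

def G(w, z, terms=80):
    """Truncated series sum_{n < terms} exp(w n^2 / 2) z^n / n!."""
    return sum(math.exp(w * n * n / 2) * z ** n / math.factorial(n) for n in range(terms))

def G_derivative(w, z, k, terms=80):
    """k-th derivative with respect to z, differentiated term by term."""
    return sum(
        math.exp(w * n * n / 2) * z ** (n - k) / math.factorial(n - k)
        for n in range(k, terms + k)
    )

def G_shifted(w, z, k):
    """Right-hand side exp(k^2 w / 2) * G(w, z * exp(k w))."""
    return math.exp(k * k * w / 2) * G(w, z * math.exp(k * w))

w, z = -0.3, 0.7
for k in (1, 2, 5):
    print(k, G_derivative(w, z, k), G_shifted(w, z, k))
```

Reading the identity with $k$ replaced by $-n$ is what makes $G(w, ze^{-nw})$ behave like an $n$-fold integral.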

Wright [42] proved that the number of strongly connected digraphs with $n + r$ arcs
on $n$ vertices, disallowing self-loops and multiple arcs, is $n!$ times a polynomial in $n$ of
degree $3r - 1$, when $n > r > 0$. His proof can be adapted to multidigraphs, and everything
becomes much simpler, just as formula (9.4) for multigraphs is simpler than formula (9.20)
for graphs. The analogs of (2.11) and (3.4) are

$$S(w, z) = w^{-1}S_{-1}(wz) + S_0(wz) + wS_1(wz) + w^2S_2(wz) + \cdots, \tag{29.14}$$

where

$$S_{-1}(z) = z, \tag{29.15}$$

$$S_0(z) = -\ln(1 - z), \tag{29.16}$$

and $S_r(z)$ for $r \ge 1$ can easily be shown to be $(1 - z)^{-3r}$ times a polynomial in $z$ of degree
$< 3r$. For example, the multidigraphs enumerated by $wS_1(wz)$ all arise by inserting
("uncancelling") vertices in the arcs of the reduced multidigraphs

[Figure: the three labelled reduced multidigraphs of excess 1, on the vertex sets {1}, {1, 2}, and {1, 2}.]

whose generating functions are respectively $\frac12 w^2z$, $\frac14 w^3z^2$, $\frac14 w^3z^2$. The operation of
uncancelling corresponds to replacing $w$ by $w/(1 - wz)$, as in Lemma 1; so $wS_1(wz) =
\frac12 w^2z/(1 - wz)^2 + \frac12 w^3z^2/(1 - wz)^3 = \frac12 w^2z/(1 - wz)^3$ and $S_1(z) = \frac12 z/(1 - z)^3$.
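The computation of $S_1$ is short enough to verify with exact arithmetic. The sketch below takes the three reduced generating functions to be $\frac12 w^2z$, $\frac14 w^3z^2$, $\frac14 w^3z^2$ (one vertex with two loops, and the two labellings of a double arc closed by a single return arc; this reading of the lost figure is an assumption), substitutes $w \mapsto w/(1 - wz)$, and compares the result with $w\,S_1(wz)$ for $S_1(z) = \frac12 z/(1-z)^3$. It also evaluates the signed total weight of the reduced multidigraphs by parity of the number of vertices.

```python
from fractions import Fraction as F

def uncancelled_sum(w, z):
    """Reduced generating functions after replacing each arc weight w by w/(1 - wz)."""
    W = w / (1 - w * z)
    return F(1, 2) * W ** 2 * z + F(1, 4) * W ** 3 * z ** 2 + F(1, 4) * W ** 3 * z ** 2

def S1(z):
    return F(1, 2) * z / (1 - z) ** 3

samples = [(F(1, 3), F(1, 5)), (F(2, 7), F(3, 4)), (F(-1, 2), F(1, 9))]
matches = all(uncancelled_sum(w, z) == w * S1(w * z) for w, z in samples)

# Weights at w = z = 1: 1/2 on one vertex, 1/4 + 1/4 on two vertices.
signed_weight = -F(1, 2) + F(1, 4) + F(1, 4)
print(matches, signed_weight)
```

For excess 1 the odd and even vertex counts carry equal weight, so the signed total vanishes.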

In fact, the numerator of $S_r(z)$ turns out to have a surprisingly small degree. Computer
calculations indicate that we can write

$$S_r(z) = \frac{s_{r0}\,z}{(1-z)^{3r}} + \frac{s_{r1}\,z}{(1-z)^{3r-1}} + \cdots + \frac{s_{r(2r-2)}\,z}{(1-z)^{r+2}}, \tag{29.17}$$

a formula analogous to (8.4), at least when $r \le 5$. The coefficients are

   24  1630711  2840093  3546283  6743  25307  43
   $s_{5d}$
   160  40  80  120  480  360  36  720

No reason why $S_r(z)$ should have the simple form (29.17) is apparent; this phenomenon
cries out for explanation, if it is indeed true for all $r > 0$, and the explanation will probably
lead to new theorems of interest. It can be shown that this conjecture is equivalent to
the assertion that the sum of $(-1)^{n(M)}\kappa(M)/n(M)!$ over all labelled, reduced, strongly connected
multidigraphs $M$ of excess $r$, where $n(M)$ is the number of vertices of $M$, is zero; or in other words, if we choose a labelled, reduced,
strongly connected multidigraph of excess $r$ at random, with probabilities weighted in the
natural way by the compensation factor $\kappa$, then the probability is $\frac12$ that there will be an
even number of vertices.

Is there a simple recurrence governing the leading coefficients $s_{10}, s_{20}, s_{30}, \ldots$, perhaps
analogous to the relation we observed for ordinary connected components in (8.5)?

Acknowledgments. The authors wish to thank Prof. Richard Askey for helpful corre- 
spondence relating to this research. 

Appendix. Here is a list of corrections to the related paper [14]. 
Page 175, line 10: (1 + t)^ should be (1 + 1)'^ 
Page 175, line 11: (3.5) should be (3.6) 
Page 182, (4.21): should be V^t 
Page 183, line 18: i y/M should be ^
Page 183, line 24: (4.27) should be (4.25)
Page 184, (5.6): I = 1 should be I - 1
Page 185, line 17: 1 = 2 should be I = S
Page 189, lines 4 and 9: i/(/ - 1) should be ^1(1 + 1) 
Page 192, (7.13): g should be ^; 2 + 3p3 should be ps 
Page 194, line 15: 'than 3?/i(A) - A - (1 - |A)(ln(l - |A) - ln(l + |A)) 

< m{X) - |A2 when' 
Page 205, line 7: delete 'number of 
Page 207, (11.9): delete commas in denominator 
Page 209, first line of (A.6): ixt - it^d should be ixt + it^S 
Page 213, the argument for enveloping series is incomplete 
Page 215, (11.12) and (11.14): delete commas in denominators